Introduction

In the marine environment, most adults of invertebrate species are benthic and sedentary (Mileikovsky 1971). At each free spawning event, they produce new pelagic larvae that can disperse from their source locations over broad geographical ranges (Siegel et al. 2003) and influence the demography and genetic structure of previously settled populations (Moberg and Burton 2000; Flowers et al. 2002; D’Aloia et al. 2015). Connectivity between populations, which results from this dispersion, is thus involved in the key processes of population dynamics, from short-term demography to long-term adaptive potential and evolution. In the context of environmental change and increasing anthropogenic pressures, understanding how populations are connected, as well as their potential to recover after disturbances, is a major issue for the conservation of species and resources (Pineda 2000; Burgess et al. 2014).

The potential for larval dispersion has long been expected to correlate with pelagic larval duration (PLD) (Siegel et al. 2003), leading to the paradigm that PLD was the key factor driving the genetic structure of populations (Hellberg 1996; Gilg and Hilbish 2003). This view has recently evolved, thanks to an increasing number of studies showing that PLD and genetic structure do not always correlate as expected (Selkoe and Toonen 2011; Faurby and Barber 2012; Foster et al. 2012; Iacchei et al. 2013). Several factors, acting at various spatial and temporal scales, may indeed strongly impact population dynamics, such as life history traits, larval behavior, biotic interactions, ocean circulation or the availability of suitable habitats (Guizien et al. 2006; Pineda et al. 2007; Butler et al. 2011; Pascual et al. 2017). As a result, the full potential range of dispersion is not necessarily achieved, and population differentiations can thus be detected at a smaller spatial scale (Iacchei et al. 2013).

The sea urchin Paracentrotus lividus is a benthic echinoderm with a pelagic larval stage. The species is widely distributed all around the Mediterranean Sea and along the north-eastern Atlantic coast, and strongly contributes to the regulation of the macroalgal community structure (Boudouresque and Verlaque 2013). P. lividus lives in the infralittoral zone, mainly within the first meter of water, but can be found down to a depth of 20 m (Hereu et al. 2004). For decades, it has been intensely harvested for food consumption, and the sharp, global population decline and collapse that has been observed (http://www.fao.org/fishery/statistics/en) has led to growing concerns about its sustainability as a natural resource (Barnes and Crook 2001; Guidetti et al. 2004).

Whilst adults are benthic and sedentary, their larvae remain within the water column for approximately 4 weeks before settling (Fenaux et al. 1985; Pedrotti 1993). As previously reported in other echinoids (Ebert 1983; Sala et al. 1998), P. lividus undergoes clear inter-annual recruitment variability (Sala et al. 1998; Hereu et al. 2004; Tomas et al. 2004), even at small geographic scales (i.e. <hundreds of meters) (Hereu et al. 2004).

Over its whole distribution range, the species presents two main genetic discontinuities, between the Atlantic and Mediterranean Sea basins (Duran et al. 2004; Maltagliati et al. 2010; Penant et al. 2013), and between the Adriatic Sea and the rest of the Mediterranean basin (Maltagliati et al. 2010; Penant et al. 2013; Paterno et al. 2017). Recently, a north-to-south differentiation within the western Mediterranean basin has been revealed following analysis of more than 1000 polymorphic loci (Paterno et al. 2017). At small geographic scales, analysis of the genetic structure has yielded contrasting results. For example, using mitochondrial markers, Penant et al. (2013) detected high numbers of significant genetic differentiations among populations within all basins (i.e. Atlantic, Western and Eastern Mediterranean Sea and the Adriatic Sea), although Calderón et al. (2012) did not detect differentiation in the Western Mediterranean Sea. More recently, using a high number of polymorphic loci, Paterno et al. (2017) did not detect any differentiation within the Adriatic basin. They suggest that the discrepancy between the results obtained with mitochondrial and nuclear markers might be the result of major contributions by a small number of females to the next generations, as well as intraspecific incompatibilities between male and female gametes. This is also suspected to occur in P. lividus (Calderón et al. 2009b).

Temporal genetic differentiation between cohorts of recruits from successive generations, along the south-eastern coast of Spain, could be the evidence that underlies the fluctuation of genetic diversity occurring between years. However, such variability was not found at only one location (Calderón et al. 2009b; Calderón and Turon 2010a). Moreover, it is interesting to note that significant differentiations between temporal cohorts at one location have been repeatedly reported in various highly dispersive marine species (Johnson and Black 1982; Watts et al. 1990; Pujolar et al. 2006; Hogan et al. 2010). This would suggest that fine-scale genetic patchiness, as yet unreported, might not be a rare event in P. lividus.

Thus, drawing a reliable picture of connectivity at a small scale (i.e. within the larval dispersal range), based on traditional FST measurements only, appears challenging. By performing genetic relatedness and kinship analysis, several authors have been able to assess the potential of larval dispersion and small-scale connectivity (Veliz et al. 2006; Schunter et al. 2014), and thus demonstrate a relationship between chaotic genetic patchiness and kin aggregation (Iacchei et al. 2013; Aglieri et al. 2014; Selwyn et al. 2016), suggesting that in all likelihood, larval pools could disperse cohesively (Iacchei et al. 2013; Eldon et al. 2016; Selwyn et al. 2016). However, further experimental studies based on the analysis of successive temporal cohorts are required in order to test this assumption, as well as assessing gene flow between two spawning seasons (Veliz et al. 2006; Iacchei et al. 2013; Aglieri et al. 2014).

With this in mind, our objectives were to test the hypotheses that (1) genetically differentiated populations could be observed at a small scale for each temporal cohort and that kin aggregation would contribute to such differentiations, and (2) it is possible to depict the pattern of connectivity at a small spatial scale. In this study, we used a set of 12 microsatellite markers to monitor, over a three-year period, the genetic variability of P. lividus populations at 11 localities within a focal region of the south-eastern French Mediterranean coast, where sea urchin populations are under fishing pressure.

Materials and methods

Study species, study area and sampling

A total of 1370 Paracentrotus lividus sea urchins were hand harvested by snorkeling during the French legal period (i.e. spanning a period from the 16th of April until the 31st of October: http://www.dirm.mediterranee.developpement-durable.gouv.fr/IMG/pdf/arrete-PSM-27octobre2008_cle5f77e2.pdf) in 2010, 2011 and 2012, along the south-eastern Mediterranean coast of France at 11 locations between Carry-le-Rouet and Villefranche-sur-Mer. Sampling locations are presented in Fig. 1.

Fig. 1

Chart of the studied area. At each location, sea urchins of three cohorts were harvested: adults and recruits-of-the-year 2011 and 2012, denominated as in the following example: Car2010a, Car2011r or Car2012r where a and r stand for adults and recruits, respectively

In 2010, adults with a test diameter greater than 4 cm were collected. Though we have no obvious information about the year of their recruitment, we assumed that most of them were at least 3 years old (Calderón et al. 2009b). Also, as we did not sacrifice animals, genders were not determined. In 2011 and 2012, small sea urchins with a test diameter of less than 1 cm were harvested, and were assumed to be recruits-of-the-year (Calderón et al. 2009b).

We use the term “cohort” to refer to the set of recruits-of-the-year sampled in 2011 and 2012, as well as, for better clarity, to the set of adults sampled in 2010. We use the term “cohort-by-location” to refer to the set of individuals sampled in a specific location either in 2010, 2011 or 2012. We thus investigated a total of 33 independent cohort-by-locations, denominated as follows: “location”, “year”, “a or r”, where a and r stand for adults and recruits, respectively (e.g. the three cohort-by-locations collected at Carry-le-Rouet are denominated: Car2010a, Car2011r and Car2012r). We use the term “population” to refer to all of the individuals collected at a specific location, that is, considering the three cohorts together.

A few spines of each collected individual were sampled, then sea urchins were returned to their initial locations. Spines were conserved in absolute ethanol and kept on ice during the sampling campaign, then at −80 °C until DNA extraction.

Bioinformatic searching of microsatellites and genotyping

Sequences containing microsatellites (or simple sequence repeats – SSR) were searched with a Biopython tool, in a genomic database dedicated to Paracentrotus lividus (http://octopus.obs-vlfr.fr/blast/blast3.php). This was accomplished with a BLASTn function available online (http://octopus.obs-vlfr.fr/blast/oursin/blast_oursin.php), that searched for all possible di-, tri- and tetra-nucleotide motifs, repeated at least 6 times. Query sequences were filtered for low complexity regions. The threshold e-value was set at 10 and we used a Blosum62 matrix of substitutions. The Blast search resulted in 1916 non-redundant sequences. We also manually searched for complementary SSR in P. lividus sequences available in Genbank. We arbitrarily selected 55 sequences in which microsatellite flanking regions were long enough, did not contain more than 5 nucleotide stretches or another SSR, and were composed of a minimum of 30% GC. The putative location of each microsatellite within either an untranslated region (UTR), intron or exon was assessed by searching for significant similarities between the selected sequence and those from the NCBI database, or the relevant P. lividus specific transcriptomic and genomic database, using both BLASTn and BLASTx functions (http://octopus.obs-vlfr.fr/blast/oursin/blast_oursin.php). When SSR could not precisely be located within a gene (i.e. ms18, see Suppl. data1), we further investigated the genome database to verify the intergenic location, taking care not to exclude those located within an intron. In total, 55 primer pairs were designed from genomic sequences using the Primer3Plus software with default parameters (Tm set at 60 °C). The primer pairs were checked for specificity then tested in PCR reactions as described below, with 30 P. lividus DNA samples to assess the quality of the amplified locus.
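The motif criterion used for the search (di-, tri- and tetra-nucleotide motifs repeated at least 6 times) can be illustrated with a short regular-expression scan. This is only a sketch under stated assumptions: the function name find_ssrs is hypothetical, it detects perfect repeats only, and it is not the BLASTn-based procedure used in this study.

```python
import re

def find_ssrs(sequence, min_repeats=6):
    """Return (start, motif, n_repeats) for perfect di-, tri- and
    tetra-nucleotide tandem repeats (hypothetical helper, perfect repeats only)."""
    sequence = sequence.upper()
    hits = []
    for motif_len in (2, 3, 4):
        # One captured motif followed by at least (min_repeats - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_repeats - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            # Skip homopolymer runs such as "AAAA...".
            if len(set(motif)) == 1:
                continue
            # Skip tetra-nucleotide motifs that are a repeated di-nucleotide.
            if motif_len == 4 and motif[:2] == motif[2:]:
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return hits

# Toy sequence: seven copies of "AC" flanked by unrelated bases.
example_hits = find_ssrs("GGT" + "AC" * 7 + "TTG")
```

In practice, candidate loci found this way would still need the flanking-region checks described above (length, GC content, absence of other SSRs) before primer design.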

Genomic DNA was purified using the DNeasy Blood and Tissue Kit (Qiagen), following manufacturer’s instructions. Concentrations were determined by measuring the absorbance at 260 nm using a Nanodrop-1000, then adjusted to 20 ng/µL. DNA integrity of randomly chosen samples was assessed on an agarose gel at 1%. PCR was performed separately for each locus in a final volume of 25 µL containing a final concentration of 1 unit of 5PRIME Taq DNA polymerase, 200 nM of forward primer, 200 nM of reverse primer, 200 µM dNTP, 1 × 5PRIME Taq DNA buffer (1.5 mM MgCl2, or alternatively, a concentration of 2.5 mM was tested by adding 1 mM MgCl2), 20 ng of DNA template and deionized water up to a total volume of 25 μL. PCR conditions included an initial denaturation at 94 °C for 5 min, 38 cycles for 50 s at 94 °C for denaturation, 50 s at 58 °C for annealing and 50 s at 72 °C for extension, followed by a final extension for 10 min at 72 °C. PCR fragments were visualized with electrophoresis on a 3% agarose gel.

All of the tested primer pairs gave a positive signal on electrophoretic gels, of which 8 were selected based on their amplification success, the number of reliable bands, and detected polymorphism on the gel. We then genotyped 46 adult individuals to assess the allele diversity of each selected locus using fluorescently labeled forward primer and performed PCR amplifications with the same conditions as described above, except that the reaction mixture contained 30 nM of labeled forward primer and 170 nM of unlabeled forward primer instead of 200 nM.

The 8 microsatellite loci described in this study (Suppl. data1), were combined with 4 microsatellite loci that had previously been described in the genome of P. lividus by Calderón et al. (2009a): namely Pl_B, Pl_C, Pl_28 and Pl_T. We thus used a total of 12 microsatellite loci.

Multiplex Manager software was used to define putative primer-pair combinations for PCR multiplexing. Three multiplex-PCR were selected that gave similar results as the control simplex-PCR conditions as described above (Suppl. data1).

Amplified fragments were resolved using an ABI 3730XL Genetic Analyzer (Applied Biosystems, Carlsbad, USA) by Genoscreen (Lille, France). Samples were run together with the LIZ600 DNA ladder. Peaks were sized with the STRand v2.4.59 Analysis software (http://www.vgl.ucdavis.edu). In the case of MS18, we investigated a di-nucleotide repeat, when considering size polymorphism, resulting from both tetra- and di-nucleotide repeat motifs.

Genetic diversity analysis

The level of polymorphism was calculated for each locus in each cohort-by-location. FSTAT v2.9.3.2 software was used to determine allele diversity (NA), allelic richness (AR), and observed and expected heterozygosity (HO and HE respectively) (Goudet 1995).
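As an illustration of these per-locus statistics, the sketch below computes the number of alleles, the observed heterozygosity and an expected heterozygosity for one locus in one cohort-by-location. It assumes Nei's unbiased estimator, HE = 2n/(2n − 1) × (1 − Σ p_i²), which may differ in detail from the estimator implemented in FSTAT; the function name and the genotype encoding (a pair of allele sizes, or None for missing data) are hypothetical.

```python
from collections import Counter

def locus_diversity(genotypes):
    """Allele number (NA), observed (HO) and expected (HE) heterozygosity
    for one locus; genotypes are (allele1, allele2) tuples or None."""
    scored = [g for g in genotypes if g is not None]
    n = len(scored)                      # number of genotyped individuals
    total = 2 * n                        # number of gene copies
    allele_counts = Counter(a for g in scored for a in g)
    sum_p2 = sum((c / total) ** 2 for c in allele_counts.values())
    ho = sum(1 for a, b in scored if a != b) / n
    # Assumption: Nei's unbiased gene diversity, not necessarily FSTAT's formula.
    he = (total / (total - 1)) * (1.0 - sum_p2)
    return {"NA": len(allele_counts), "HO": ho, "HE": he}

toy = [(100, 100), (100, 102), (102, 104), (100, 104), None]
stats = locus_diversity(toy)
```

A marked deficit of HO relative to HE at a locus is the pattern that would later raise suspicion of null alleles.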

Linkage disequilibrium was assessed for each pair of loci within each cohort-by-location using ARLEQUIN v3.5.1.2 (Excoffier and Lischer 2010). The corrected significance threshold for multiple tests was set using the Benjamini-Hochberg correction procedure (Benjamini and Hochberg 1995). Among the significantly linked loci, we checked their occurrence in the 33 cohort-by-locations.
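The Benjamini-Hochberg step-up procedure referred to here can be sketched in a few lines. The function name and the default false discovery rate of 0.05 are assumptions made for illustration; the threshold actually applied to each family of tests is the one stated in the text.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the test is declared
    significant under the Benjamini-Hochberg step-up procedure."""
    m = len(p_values)
    # Indices of the tests, sorted by increasing p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Compare the rank-th smallest p-value with rank * alpha / m.
        if p_values[idx] <= rank * alpha / m:
            largest_rank = rank
    rejected = [False] * m
    # Every test ranked at or below the largest passing rank is rejected.
    for idx in order[:largest_rank]:
        rejected[idx] = True
    return rejected

flags = benjamini_hochberg([0.02, 0.03, 0.035, 0.5])
```

Note the step-up behaviour in the toy example: the two smallest p-values do not pass their own thresholds, but they are still declared significant because the third one does.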

The genotyping data were manually checked: individuals with too much missing data were removed from the dataset so that missing data represented less than 5% of the full data for each locus within each population. Finally, 1059 individuals were considered. Then, the dataset was assessed for scoring errors, large allele drop-out and null alleles using MICRO-CHECKER 2.2.3 (Van Oosterhout et al. 2004). As null alleles were suspected, we used the software to estimate their frequencies with the correction algorithm of Van Oosterhout et al. (2004), and to generate a second dataset adjusted for the frequency of null alleles, for each locus within each cohort-by-location. The pervasiveness of null alleles was then assessed by testing both datasets in subsequent analyses.

We calculated Weir and Cockerham (1984) estimates of FIS for the global population and within each cohort-by-location by performing 10,000 genotype randomizations among samples with FSTAT (Goudet 1995). Then, significant deviation from the Hardy-Weinberg equilibrium (HWE) was tested at both these scales using the randomization procedure (100,000 steps in Markov Chain and 1,000,000 dememorization steps) implemented in ARLEQUIN (Excoffier and Lischer 2010).

Population structure

To assess the patterns of spatial and temporal genetic variations among populations and cohorts, we calculated pairwise FST values using FreeNA software, with and without null alleles (Chapuis and Estoup, 2007). Ninety-five percent confidence intervals were obtained using 50,000 bootstrap iterations. Estimates of FST obtained with and without null alleles were compared with a paired two-tailed t-test. Then, we calculated pairwise FST values, using the estimator θ of Weir and Cockerham (1984) and the corresponding p-values, and tested for allele-frequency heterogeneity using an exact test with ARLEQUIN (Excoffier and Lischer 2010). The corrected significance threshold for multiple tests was set using the Benjamini-Hochberg (B-H) correction procedure (Benjamini and Hochberg 1995).

Isolation by distance was assessed by testing the relationship between Ln(FST/(1−FST)) and minimal marine distances between pairs of locations for each cohort, using a Mantel test implemented in GenAlEx v6.501 (Peakall and Smouse 2006), with 999 permutations. Geographic distances were measured using Google Maps.
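The logic of a Mantel test can be sketched as follows: the correlation between the two distance matrices is compared with the correlations obtained after jointly permuting the rows and columns of one matrix. This is a minimal pure-Python illustration, not the GenAlEx implementation; the function names, the one-tailed p-value and the fixed random seed are assumptions. The log transform mirrors the Ln(FST/(1−FST)) quantity described above and is only defined for pairwise FST values strictly between 0 and 1.

```python
import math
import random

def log_linearised_fst(fst):
    """Ln(FST / (1 - FST)); requires 0 < fst < 1."""
    return math.log(fst / (1.0 - fst))

def transform_matrix(fst_matrix):
    """Apply the transform to off-diagonal cells; the diagonal is set to 0."""
    n = len(fst_matrix)
    return [[log_linearised_fst(fst_matrix[i][j]) if i != j else 0.0
             for j in range(n)] for i in range(n)]

def upper_triangle(matrix):
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def mantel_test(genetic, geographic, n_perm=999, seed=0):
    """Return (observed r, one-tailed permutation p-value)."""
    rng = random.Random(seed)
    n = len(genetic)
    geo = upper_triangle(geographic)
    r_obs = pearson(upper_triangle(genetic), geo)
    order = list(range(n))
    count = 0
    for _ in range(n_perm):
        rng.shuffle(order)
        # Permute rows and columns of the genetic matrix together.
        permuted = [[genetic[order[i]][order[j]] for j in range(n)]
                    for i in range(n)]
        if pearson(upper_triangle(permuted), geo) >= r_obs - 1e-12:
            count += 1
    return r_obs, (count + 1) / (n_perm + 1)
```

With 11 locations per cohort, each matrix contributes 55 pairwise values to the correlation.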

We assessed potential selection acting on microsatellite loci using LOSITAN (Antao et al. 2008), considering “neutral mean FST” and “forced mean FST”, for both the IAM and SMM assumptions. The false discovery rate was set at 0.01.

Additionally, we calculated the genetic differentiation within cohort-by-locations (local) using GESTE v2.0 software (Foll and Gaggiotti 2006).

Relatedness and relationship analysis

Relatedness

In an initial analysis, we evaluated the level of genetic relatedness (i.e. r-coefficient) of pairwise individuals with ML-relate software (Kalinowski et al. 2006). As this software can accommodate for the presence of null alleles, we performed two independent runs, either with or without null alleles being taken into account. We tested the influence of null alleles on the r-coefficient between individuals with a two-tailed t-test. We finally performed a conservative relatedness analysis by excluding null alleles as recommended by Wagner et al. (2006).

We then calculated the mean observed r-coefficients within each cohort-by-location and each cohort-by-location pair (i.e. two cohort-by-locations combined), and tested whether an r-coefficient equal to or higher than the observed one could have been expected by chance alone. To this end, we used GraphPad software to fit the observed distribution of individual pairwise r-coefficients to an exponential curve describing a theoretical distribution of r-coefficients (r² = 0.995) and verified that the observed and theoretical means and variances were similar. We then performed a Monte Carlo Markov Chain (MCMC) randomization of r-coefficients (i.e. equivalent to 1000 permutations of individuals across the whole set of populations) to calculate the mean r-coefficient expected by chance alone under a random mating hypothesis. The observed within and pairwise cohort-by-location r-coefficients were compared to the simulated r-coefficients by means of a two-tailed t-test. The corrected significance threshold for multiple tests was set using the Benjamini-Hochberg (B-H) correction procedure (Benjamini and Hochberg 1995).
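The idea of comparing an observed mean relatedness with the value expected by chance can be illustrated with a plain label-permutation test. This is a simplified stand-in, not the procedure used here: it shuffles group labels directly instead of drawing from a fitted exponential distribution, and it reports an empirical one-tailed p-value rather than a t-test. The function names and the data layout (a square matrix of pairwise r-coefficients plus one group label per individual) are hypothetical.

```python
import random
from itertools import combinations

def mean_within_group_r(r, labels, group):
    """Mean pairwise r-coefficient among individuals carrying `group`."""
    members = [i for i, g in enumerate(labels) if g == group]
    pairs = list(combinations(members, 2))
    return sum(r[i][j] for i, j in pairs) / len(pairs)

def permutation_p_value(r, labels, group, n_perm=1000, seed=0):
    """Return (observed mean r, empirical one-tailed p-value)."""
    rng = random.Random(seed)
    observed = mean_within_group_r(r, labels, group)
    shuffled = list(labels)
    exceed = 0
    for _ in range(n_perm):
        # Reassign individuals to groups at random, keeping group sizes.
        rng.shuffle(shuffled)
        if mean_within_group_r(r, shuffled, group) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)

# Toy data: individuals 0-2 are mutually related (r = 0.5), others are not.
toy_r = [[0.5 if (i < 3 and j < 3 and i != j) else 0.0 for j in range(6)]
         for i in range(6)]
toy_labels = ["A", "A", "A", "B", "B", "B"]
obs, p = permutation_p_value(toy_r, toy_labels, "A")
```

The resulting p-values, one per cohort-by-location or per pair, would then be passed through the B-H correction.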

Relationship

The relationship between pairs of individuals within the whole dataset was investigated using two contrasting and complementary software packages: COLONY v2.0.6.1 (Jones and Wang 2010) and ML-relate.

We first used COLONY to assess sib-pairs (i.e. half-sibs: HS and full-sibs: FS) within our dataset and their occurrence after 5 independent runs. In the parameter settings, we specified that both males and females were polygamous, since fertilization is external, and chose a mating system with inbreeding. We chose a full-likelihood approach, with medium precision and medium run length. We let COLONY update allele frequencies and scale sib-ship results, but did not select any sib-ships prior to analysis as we had no information about individual pairwise relationships. We set the mistyping error rate to 1% for each marker. We also generated 5 permuted datasets (using the Microsoft Excel PopTools add-in) and conducted a similar analysis to determine the possibility of identifying identical sib-pairs by chance alone. COLONY associates a posterior probability with each sib-pair relationship. In a conservative approach, only sib-pairs with an associated posterior probability higher than or equal to 0.99 were considered.

ML-relate was used to check the sib-pair assignments obtained with COLONY. ML-relate software has the advantage of being insensitive to null alleles, as well as providing an estimate of the existence of related rather than non-related individuals. Specifically, ML-relate indicates the relationship (R) with the highest likelihood LnL(R) and specifies how much lower the log-likelihood, Delta Ln(L), is for each of the other relationships. We then statistically assessed the reliability of our results by performing a “specific hypothesis test of relationship”, implemented in ML-relate, that attempted to exclude an alternative relationship (here “Unrelated”) by performing 50,000 simulations. We considered that the relationship was much more likely than “Unrelated” when p-values were below 0.05 (Kalinowski et al. 2006).

The relationships between recruits are expected to be either “unrelated” (U), “full-sib” (FS) or “half-sib” (HS). As sea urchins may be sexually mature before reaching the adult size class, relationships between adults and between adults and recruits can be either “unrelated” (U), “full-sib” (FS), “half-sib” (HS) or even “parent-offspring” (PO). COLONY could not retrieve PO relationships among the dataset of adults (data not shown).

Parentage analysis

As adults and recruits-of-the-year were sampled, we attempted to detect putative parent-offspring pairs, using CERVUS (Kalinowski et al. 2007), ML-relate and COLONY software.

We first used CERVUS software, which accurately identifies parent-offspring pairs in empirical data (Slate et al. 2000). Specifically, we tested the observed assignments between adults of each location against all recruits of the dataset. In an initial step, we evaluated the number of observed assignments as a function of the expected proportion of adults sampled; we tested a range of values between 0.01 and 90% (i.e. for the “adult proportion sample” parameter). For each assignment simulation, CERVUS calculates a critical LOD score, set at 95% confidence; these scores were always higher than or equal to 4.7. In a conservative approach that accounts for a proportion of non-sampled putative parents within a population, we performed the assignment tests with the assumption that 1% of the putative parents within each location were sampled. This was the value that retrieved the lowest number of parent-offspring pairs, with only critical LOD scores higher than or equal to 4.7 being considered. We retained candidate parent-offspring pairs when at least 10 loci could be compared between pairs of individuals, with a maximum of 2 mismatching loci, as the odds of observing the absence of a shared allele at a locus increase with the presence of null alleles. Thus, two homozygotes at a locus, displaying different alleles, may share an unknown null allele but will mismatch at this locus. Afterwards, the genotypes of each pair of individuals were checked visually for mismatches to prevent false peak assignments.
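The retention rule for candidate parent-offspring pairs (at least 10 comparable loci and at most 2 mismatching loci) can be expressed as a short check. The sketch below is illustrative only: the function names and the genotype encoding (one (allele1, allele2) tuple per locus, or None when the locus is missing) are assumptions, and it does not reproduce the LOD-score computation performed by CERVUS.

```python
def compare_pair(adult, recruit):
    """Return (number of loci compared, number of mismatching loci).
    A locus mismatches when the two genotypes share no allele."""
    compared = 0
    mismatches = 0
    for g_adult, g_recruit in zip(adult, recruit):
        # Loci missing in either individual cannot be compared.
        if g_adult is None or g_recruit is None:
            continue
        compared += 1
        if not set(g_adult) & set(g_recruit):
            mismatches += 1
    return compared, mismatches

def is_candidate_parent_offspring(adult, recruit, min_loci=10, max_mismatches=2):
    compared, mismatches = compare_pair(adult, recruit)
    return compared >= min_loci and mismatches <= max_mismatches

# Toy 12-locus genotypes: the recruit shares an allele at 10 loci.
adult = [(100, 102)] * 12
recruit = [(102, 104)] * 10 + [(106, 108)] * 2
keep = is_candidate_parent_offspring(adult, recruit)
```

Tolerating up to two mismatches reflects the null-allele argument above: two apparent homozygotes for different alleles may in fact share an unamplified allele.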

COLONY and ML-relate software were then used to verify parent-offspring assignments obtained with CERVUS. The assignment tests performed with COLONY considered the whole set of putative parents against the whole recruit dataset. We used the same parameters as described above, with a probability of being a parent (i.e. a mom) arbitrarily set at 0.5. As presented above, only pairs with a posterior probability of at least 0.99 were considered. To assess the reliability of parent-offspring assignments, we ran a supplementary analysis with the initial database in which we included two known parent-offspring pairs and 20 of their offspring. COLONY efficiently retrieved all parent-offspring pairs, whilst no pairs were observed between the included parents and the offspring of the initial dataset (personal data). The assignment tests performed with ML-relate were those performed to assess the relationship between pairs of individuals within the whole dataset as described above.

Finally, putative parent-offspring pairs retrieved by all three contrasting programs were considered reliable.

Results

Characteristics of the new microsatellite loci and within population genetic diversity

Among the 8 new loci selected, 2 are located within untranslated regions (UTR), 4 within exons, 1 within an intron and 1 is likely to be intergenic (Suppl. data1). They displayed marked differences in both their allelic diversity and null allele frequencies depending on their location within the genome. Indeed, markers located within exons or UTRs were less polymorphic than those located within introns or intergenic regions, which was expected, as a mutation within a messenger RNA can result in an abnormal, non-functional protein or in abnormal abundance. Remarkably, the ms40 marker, which is located within the first exon of the bindin gene coding sequence, was found to be highly polymorphic.

Overall, the 12 markers used in this study were found to be polymorphic in all populations, with the total number of alleles ranging from 8 to 48 (mean = 27.5; Standard Deviation SD = 15.6) and the observed heterozygosity ranging from 0.033 (ms26; Vil2011r) to 1 (ms40; Gie2010a) (mean = 0.548; SD = 0.004). We found similar levels of allelic richness and deviation from HWE among groups for each marker, indicating that genetic diversity is maintained over generations (Suppl. Data2).

Analysis of the whole dataset with MICRO-CHECKER did not show scoring errors due to large allele dropout or stuttering, but indicated potential null alleles (p < 0.05) for most loci and in all populations. Indeed, high FIS estimates were observed with 8 markers, associated with a strong deviation of the populations from Hardy-Weinberg equilibrium (Suppl. Data2). The correction for null alleles consistently reduced the high FIS values observed in the whole dataset (i.e. populations combined) (Suppl. Data3), and globally increased the p-value for the HWE test (data not shown), which indicates that null alleles likely contributed to the observed deviation from HWE.

When combining the analysis performed within the 33 cohort-by-locations, 266 loci pairs out of 2178 tested, displayed significant linkage disequilibrium after B-H correction. We did not find similar patterns of LD among groups, suggesting that loci are not physically linked.
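The number of tests follows directly from the sampling design, and the arithmetic can be checked with a few lines. The variable names are illustrative; the input values (12 loci, 11 locations, 3 cohorts, 266 significant pairs) are those given in the text. The same reasoning yields the 55 location pairs and 528 cohort-by-location pairs used in the following sections.

```python
from math import comb

n_loci = 12
n_locations = 11
n_cohorts = 3
n_cohort_by_locations = n_locations * n_cohorts   # 33 groups

# Linkage disequilibrium: every pair of loci, within every group.
locus_pairs = comb(n_loci, 2)                      # 66 pairs of loci
ld_tests = locus_pairs * n_cohort_by_locations     # 2178 tests

# Pairwise comparisons among locations and among cohort-by-locations.
location_pairs = comb(n_locations, 2)              # 55
cohort_by_location_pairs = comb(n_cohort_by_locations, 2)  # 528

# Share of locus pairs in significant LD after B-H correction.
ld_fraction = 266 / ld_tests                       # roughly 12%
```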

Population structure

Pairwise FST estimates generated with and without correction for the presence of null alleles were significantly different. Overall, the correction for null alleles resulted in an increase of FST estimates (p-value < 0.001), specifically for significant pairwise comparisons. Thus, the analysis of genetic differentiation among cohort-by-locations was performed with null alleles included, corresponding to the configuration that lowered FST to fit conservative conditions.

The overall FST estimate was small but significant (AMOVA; FST = 0.004; 95% CI: 0.0007–0.005; p-value = 0.035) and the genetic diversity within cohort-by-locations explained 99.6% of the total genetic variance observed. At the spatial level, pairwise population differentiations were significant in only 3 out of 55 pairwise comparisons, and of similar intensity (0.0027 < FST < 0.0032) (Suppl. Data4). Within the cohorts of recruits-of-the-year, FST values were significant in 6 and 1 out of 55 pairwise comparisons, in 2011 and 2012 respectively (Table 1). FST values observed within the cohorts of recruits were on average 3.4 times higher than those observed at the population level. When applying the B-H correction for multiple comparisons, there remains one significant differentiation between Sic and Ray, in 2011.

Table 1 Spatial pairwise FST for each cohort

The Mantel tests did not reveal any isolation-by-distance pattern for the three cohorts (P > 0.1; r < 0.1), and LOSITAN showed that the loci were most likely not under selection (Suppl. Data5).

Thus, on a regional scale, the entire population appears to be highly homogenous, consistent with the AMOVA and the transient spatial pair-wise population differentiations, observed within a 20 km range.

Cohort-by-location mean relatedness

The presence of null alleles significantly affected the mean genetic relatedness between individuals within the whole dataset: mean r-coefficients were 0.037 and 0.041 with and without correction for null alleles, respectively. The mean r-coefficient calculated within cohort-by-locations and cohort-by-location pairs ranged from 0.0315 to 0.0715. The MCMC simulation yielded a simulated mean r-coefficient of 0.0414, which corresponded to the level of relatedness expected by chance alone.

When applying the B-H correction for multiple comparisons, we found instances of genetic relatedness within 6 cohort-by-locations out of 33 tested, and among 21 cohort-by-locations pairs out of 528 tested (Suppl. Data6).

The 6 instances of significant genetic relatedness within cohort-by-locations, correspond to the cohorts of adults from Bom, Tro, Gie and Vil, and the cohorts of recruits of Bom2011 and Vil2011 (Suppl. Data6).

We then assessed the mean relatedness between cohort-by-location pairs. First, we considered the relatedness between recruits and adults, which would be an indication of connectivity between two locations. We found 11 related pairs: i) recruits of Bom2011 were found related to the 5 cohorts of adults located in Car, Ray, Gie, StR and Vil; ii) recruits of StR2011 were related to the cohorts of adults of Vil, Gie and StR; iii) recruits of Vil2011 were related to the cohort of adults at Vil; iv) recruits of Sic2012 were related to the cohort of adults from StR and v) recruits of Mag2012 were related to the cohort of adults from Vil (Suppl. Data6). These results indicate that gene fluxes are detectable both over the whole geographical area of the study and at fine scales. Connections between localities were generally orientated eastward, and appeared to be recurrent. Westward connections were observed twice, at both the west and east of Cape Sicié.

Then, we considered the relatedness within each cohort. We found 5 related pairs within the cohort of adults involving Vil with StR, Bom with Mug, and StR with Gie and Mug. Within the recruit cohort of 2011, Bom was found related to Our and Vil. No related locations were observed within the recruit cohort of 2012 (Suppl. Data6). These results appeared consistent with those described above, as well as evidence that patches of genetically related recruits can occur at a very fine scale.

Finally, we found 3 related pairs among the Bom2011 recruits, with the geographically close locations of Tro, Gie and Sic from 2012 (Suppl. Data6). These results suggest recurrent recruitment of related individuals at a local scale.

Relatedness of pairwise individuals

Here, we assessed the proportion of pairs of related individuals, based on their r-coefficient, that were found either among or within localities, for the 3 cohorts. We found 45 out of 93,145 pairs of adults, 26 out of 109,525 pairs of 2011 recruits and 29 out of 154,026 pairs of 2012 recruits that displayed r-coefficients higher than 0.56. Below this value, pairs were mostly found among locations, although the relative proportion of those found within locations seems to increase with the r-coefficient (Fig. 2, recruits of 2011 and 2012). Within the cohorts of recruits, highly related individuals are mostly found within localities (p = 0.001), suggesting that at least a fraction of related individuals from a larval pool are likely to disperse cohesively (Fig. 2). In the case of the adult cohorts, the highest r-coefficients were found both within and among locations in similar proportions, as expected, because adults probably belong to several distinct temporal cohorts (Fig. 2).

Fig. 2

Distribution of individual pairwise r-coefficients within the dataset, represented with 0.01 intervals, for the cohort of adults (top), recruits of 2011 (middle) and recruits of 2012 (bottom). The dark and light grays represent the proportion of pairs composed of individuals harvested at the same (within) or at different (among) locations, respectively. A gap means that no pairs were retrieved for this level of relatedness

We then tested, for the three cohorts considered independently, whether the proportion of related individuals within each location was correlated with local FST. We observed such a correlation only in the 2011 data (r = 0.87; p < 0.001), indicating that cohesive dispersion might drive spatial population structuring. However, the persistence over time of such a correlation seems unlikely, as suggested by the results obtained with the adults and the recruits of 2012 (Fig. 3).

Fig. 3

Relationships between local differentiation (FST) and the proportion of related individuals (i.e., full-sibs and half-sibs identified with both COLONY and ML-relate software) found within each location, for the cohorts of adults (top), recruits of 2011 (middle) and recruits of 2012 (bottom).
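A correlation of this kind between the per-location proportion of related individuals and local FST can be sketched as below. The text does not state which test produced the reported p-value, so the permutation test here is an assumption made for illustration, and the eight data points are invented rather than taken from the study.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def permutation_p(x, y, n_perm=9999, seed=0):
    """Two-sided permutation p-value: shuffle y and compare |r| to the observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if abs(pearson_r(x, y_perm)) >= observed:
            hits += 1
    # Add one to numerator and denominator so the p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical per-location values, invented for illustration only.
prop_related = [0.00, 0.01, 0.02, 0.03, 0.05, 0.06, 0.08, 0.10]
local_fst = [0.001, 0.002, 0.002, 0.004, 0.005, 0.007, 0.008, 0.011]

r = pearson_r(prop_related, local_fst)
p = permutation_p(prop_related, local_fst)
print(f"r = {r:.2f}, permutation p = {p:.4f}")
```

With only a handful of locations per cohort, a permutation test avoids relying on the normality assumption of the parametric test, which is why it was chosen for this sketch.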

Sib-ship and parentage analysis

The relationships between pairs of individuals were assessed to reveal patterns of dispersion over the studied area. ML-relate identified 113,877 half-sib pairs (HS), 4,252 full-sib pairs (FS) and 2,428 parent-offspring pairs (PO) among the 1,021,735 pairs tested. By combining these results with those obtained with COLONY, and using conservative detection thresholds, we identified a total of 35 putative full-sibs and 26 putative half-sibs in the full dataset (Table 2). None of these sib-pairs were retrieved by chance alone in the 5 permuted datasets (data not shown). Also, all pairs were independent.
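The arithmetic behind these counts can be checked with a short sketch. The variable names are illustrative. Two points are assumptions rather than statements from the text: that the 35 and 26 putative sibs are counts of pairs, and that the 1,021,735 tested pairs are all pairwise comparisons among a single set of individuals, which would imply 1,430 individuals.

```python
from math import comb, isqrt

# Counts reported by ML-relate in the text.
pairs_tested = 1_021_735
ml_relate = {"HS": 113_877, "FS": 4_252, "PO": 2_428}

# Share of all tested pairs assigned to each relationship class.
shares = {kind: n / pairs_tested for kind, n in ml_relate.items()}
flagged_total = sum(ml_relate.values())

# If every pair among n individuals was tested, then n * (n - 1) / 2 = pairs_tested.
# Solving gives n = 1430, since 1430 * 1429 / 2 = 1,021,735 (an inference, see lead-in).
n_individuals = (1 + isqrt(1 + 8 * pairs_tested)) // 2

# Sib-pairs retained after combining with COLONY (assumed to be pair counts).
retained_sibs = 35 + 26
retained_fraction = retained_sibs / (ml_relate["FS"] + ml_relate["HS"])

for kind, share in shares.items():
    print(f"{kind}: {share:.2%} of tested pairs")
print(f"individuals implied by the pair count: {n_individuals}")
print(f"sib-pairs retained after conservative filtering: {retained_fraction:.3%}")
```

The contrast between the number of pairs flagged by a single program and the number retained under conservative thresholds illustrates how strongly the combined filtering reduces the candidate set.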

Table 2 Summary of relationships retrieved with COLONY, ML-Relate and Cervus software with conservative conditions

Sib-ship

Within the cohorts of recruits, we found 5 (i.e. 4 FS and 1 HS) and 4 sib-pairs (i.e. 2 FS and 2 HS), for 2011 and 2012, respectively. It is noteworthy that each of the sib-pairs consisted of individuals harvested at the same location, except 1 HS of 2012. Furthermore, they were all found in locations encompassed within the Cape Sicié sampling zone (Suppl. Data7). These results are consistent with the cohesive dispersion of larvae as outlined above. Similar results were obtained within the cohort of adults. We only found 10 full-sib pairs, of which 2 sib-pairs were composed of individuals found at the same location (i.e. Mug and Car). The other 8 sib-pairs were composed of individuals harvested at Ray paired with individuals harvested at StR (Suppl. Data7).

Between cohorts of recruits 2011 and 2012, 8 half-sib pairs were identified, of which 2 pairs were composed of sibs harvested at the same location (i.e. Vil and StR) and 6 pairs consisted of individuals from different but geographically close locations, between Ray and Gie (Suppl. Data7). This highlights that siblings from successive cohorts possibly recruit within the same local area, indicating similar patterns of connectivity from year to year. This assumption appears to be supported by sib-pairs retrieved between the cohorts of adults and recruits. Indeed, individuals of sib-pairs were all from different but mostly close locations, again suggesting a recurrent pattern of connectivity (Suppl. Data7). Thus, overall, these results show that self-retention is a frequent occurrence, and that a fraction of adults probably contributed to several successive generations.

Parentage analysis

The parentage analysis performed with Cervus software initially resulted in 37 putative pairs. When combining the analysis with both ML-relate and COLONY, using a set of conservative thresholds, 5 unique parent-offspring pairs (PO) were finally identified (Suppl. Data7). Our analysis of parentage could only assign one likely putative parent to a recruit. No couples were retrieved.

Four out of the 5 PO were observed at a local geographic scale between Bom and Our, which again highlights that self-retention is not a rare event. By contrast, we found one pair involving a parent and offspring located at remote locations. The putative parent was located at Mug and the offspring at Vil, thus almost encompassing the wide geographical range of the studied area.

Discussion

In this study, we have documented the population genetic structure and connectivity within a geographic area that roughly equates to that of the larval dispersion of P. lividus (Siegel et al. 2003). Our aim was to assess to what extent geographically close populations of the sea urchin P. lividus, with a moderate to long pelagic larval duration, were actually connected at a geographic scale below that of the larval dispersion potential. This issue is relevant for ecologically fragile and/or exploited species, particularly when concerns about resource sustainability are expressed. Many population genetic studies, performed on benthic species with a pelagic larval stage, have demonstrated high heterozygote deficiencies, chaotic genetic patchiness and kin aggregation, suggesting that connectivity likely occurs at fine scales (Veliz et al. 2006; Iacchei et al. 2013). However, no study has specifically addressed these issues in P. lividus at fine geographic scales, apart from Paterno et al. (2017), who used a Lagrangian model of dispersion to reveal the likely pattern of larval dispersion within the Adriatic basin. To date, these modelling results have not been genetically confirmed. In an attempt to detect small-scale connectivity, we genotyped a large number of individuals from 11 populations, of which 8 were distributed within an area approximately 40 km wide.

Although there are no data on the population census size, we assumed that our sampling effort would allow us to retrieve genetically related individuals, as has already been shown in recent studies on the acorn barnacle Semibalanus balanoides (Veliz et al. 2006) and the black-faced blenny Tripterygion delaisi (Schunter et al. 2014). We considered 3 cohorts: a cohort of adults, with individuals larger than 4 cm in diameter, and 2 successive cohorts of recruits-of-the-year. It is highly probable that adults belonged to several different generations because of the great variability of growth rates in this species (Calderón et al. 2009b). By contrast, we sampled juveniles that were less than 1 cm in diameter because we could assume, based on the research of Calderón et al. (2009b), that they were probably produced during the preceding spawning event.

Overall, we observed a high level of genetic diversity and a weak genetic structure, as previously reported in other regions (Calderón and Turon 2010a; Calderón et al. 2012; Maltagliati et al. 2010; Paterno et al. 2017), and for other moderately to highly dispersive species (Iacchei et al. 2013). Nevertheless, we found a transient spatial genetic structure, below the potential dispersion range of the pelagic larvae, which is consistent with a pattern of chaotic genetic patchiness (CGP), first described by Johnson and Black (1982). This had not been described in P. lividus, though several studies have assessed the spatial genetic variability of populations of recruits (Calderón and Turon 2010a; Calderón et al. 2012). However, this pattern has regularly been reported in other dispersive marine species (Johnson and Black 1982; Hogan et al. 2010; Iacchei et al. 2013; Aglieri et al. 2014), suggesting that CGP is possibly common in P. lividus populations.

In broadcast marine species, the large variance in reproductive success among adults (Hedgecock 1994; Hedgecock and Pudovkin 2011), as well as kin aggregation (Selkoe et al. 2006; Broquet et al. 2013; Iacchei et al. 2013; Aglieri et al. 2014; Selwyn et al. 2016), are amongst the main explanations for CGP, yet large variance in spatial recruitment (Hedgecock 1994; Hereu et al. 2004; Tomas et al. 2004; Hedgecock and Pudovkin 2011) and selection (Cornwell et al. 2016; McKeown et al. 2017) are other possible causes. Indeed, during spawning events, some extremely fertile progenitors gather to form patches of reproductive aggregates within a location (Pennington 1985; Levitan and Sewell 1998; Boudouresque and Verlaque 2001; Seamone and Boulding 2011), which themselves may differ in their genetic composition. Their larvae potentially disperse and recruit cohesively, resulting in some proportion of related individuals within a location (Selkoe et al. 2006; Broquet et al. 2013; Iacchei et al. 2013; Aglieri et al. 2014; Selwyn et al. 2016). In this case, the genetic diversity within the pool of new recruits at one location is expected to be reduced when compared to the genetic diversity of the adults. We did not, however, observe any reduction. Indeed, considering the numerous groups of adults along the coast and the dispersive potential of larvae, recruits-of-the-year at one location likely result from multiple pools of larvae that were produced at different source locations. Thus, the sum of the fractions of adults that contributed to the next generation at any one location may ultimately represent the full genetic diversity observed within the entire population, which challenges the use of FST approaches alone to understand the processes of structuring and connectivity (Lowe and Allendorf 2010; Marko and Hart 2011).

By combining the FST approach and genetic relatedness analysis, we have provided additional evidence, from a field study, that related P. lividus larvae produced by one source population can disperse cohesively until settlement within the same location, potentially resulting in a detectable spatial genetic structuration within a year class. These results are consistent with previously reported observations and assumptions (Selkoe et al. 2006; Broquet et al. 2013; Iacchei et al. 2013; Aglieri et al. 2014; Eldon et al. 2016).

Patterns of connectivity were assessed using mean genetic relatedness and relationship as proxies (Veliz et al. 2006; Iacchei et al. 2013; Schunter et al. 2014) that were comparable in intensity to those obtained in similar studies for dispersive species (Veliz et al. 2006; Iacchei et al. 2013; Aglieri et al. 2014; Schunter et al. 2014; McKeown et al. 2017). Our results suggest that there is connectivity at a geographic scale below the expected potential dispersion range of larvae, a limited width of larval dispersion, and local retention. These results are consistent with recent studies (Morgan et al. 2009; Shanks 2009; Shanks and Shearman 2009) showing that an important fraction of the larval pool probably does not disperse far from its birth place. One explanation could reside in coastal boundary layers (CBL) that would markedly alter transport distances, widths of dispersal distribution, and the fraction of larvae retained near their birth place (Nickols et al. 2015).

Within our study area, dispersion was mainly oriented towards the west and this pattern seemed recurrent, with only a few occurrences detected in the opposite direction. This result is consistent with the expected hydrodynamic features, driven by the main westward Ligurian current, wind-driven surface currents and benthic topography, that could explain the dispersion of other planktonic species within the same area (Molinero and Nival 2004). It is interesting to note that we could not observe a westward dispersion beyond Mug, where the continental shelf appears. In this location, it is possible that coastal larvae are mostly spread out towards the open sea, increasing the mixing of unrelated larvae, before being transported back to the shore. This movement pattern might be significant over our entire study area, despite the putative presence of CBL. As our estimates of connectivity depend largely on mean genetic relatedness and a few parent-offspring relationships, connectivity would remain undetected if too many unrelated individuals were sampled within a group. Thus, the pattern of connectivity described here probably represents only part of the actual connectivity within the studied area, as we did not consider the influence of other populations located within or beyond the range limits.

The coupling of next-generation sequencing with a physical model of dispersion will help improve our knowledge of dynamics in highly dispersive species (Levins 2006), as recently performed on P. lividus by Paterno et al. (2017). However, input data often require refinements, both concerning life history traits (i.e. age of the individuals, gender, estimations of the spawning timing and larval development), and hydrodynamic features such as the CBL (Nickols et al. 2015; Paterno et al. 2017).

As in previous studies on broadcast spawning species (Hedgecock et al. 2004; Addison and Hart 2005), we found highly significant heterozygote deficiencies relative to expectations under Hardy-Weinberg equilibrium. These results can be due to the Wahlund effect, non-panmixia (inbreeding, non-random mating), selection, or genotyping errors. Observed heterozygote deficiencies mostly affected microsatellite loci located within non-transcribed regions. For these loci, the heterozygote deficits that we observed were highly comparable, in intensity, to those previously reported (Calderón et al. 2009a; Calderón et al. 2009b; Couvray et al. 2015). Calderón et al. (2009b) suggested that null alleles would be unlikely to explain a deficit such as this, because they had not observed any homozygotes for null alleles. They instead suggested that positive selection acting on the gamete recognition bindin protein would result in non-random mating, contributing to an excess of homozygotes in their samples.
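A heterozygote deficiency relative to Hardy-Weinberg expectations is commonly summarised by the inbreeding coefficient FIS = 1 - Ho/He, where Ho is the observed and He the expected heterozygosity. The sketch below computes it for a single locus under simplifying assumptions: it uses the uncorrected expected heterozygosity rather than a small-sample estimator, the function name and genotype layout are hypothetical, and it is not the software used in the study.

```python
from collections import Counter

def fis_single_locus(genotypes):
    """Return FIS = 1 - Ho/He for one locus.

    genotypes: list of (allele_1, allele_2) tuples, one per diploid individual.
    He is the Hardy-Weinberg expectation 1 - sum(p_i ** 2), without
    small-sample correction (a simplification made for this sketch).
    """
    n = len(genotypes)
    allele_counts = Counter(allele for pair in genotypes for allele in pair)
    total_alleles = 2 * n
    he = 1 - sum((c / total_alleles) ** 2 for c in allele_counts.values())
    ho = sum(1 for a, b in genotypes if a != b) / n
    return 1 - ho / he

# Hypothetical microsatellite genotypes (allele sizes in bp), for illustration only.
hw_sample = [(150, 154), (150, 154), (150, 150), (154, 154)]
deficit_sample = [(150, 150), (150, 150), (154, 154), (150, 154)]

print(f"FIS at Hardy-Weinberg proportions: {fis_single_locus(hw_sample):.2f}")
print(f"FIS with a heterozygote deficit:   {fis_single_locus(deficit_sample):.2f}")
```

Positive values indicate fewer heterozygotes than expected, whatever the cause (Wahlund effect, non-random mating, selection or scoring errors), so the statistic alone cannot discriminate among these explanations.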

By contrast, we found homozygotes for null alleles at every locus, and these were more frequent for loci within non-transcribed regions than for loci located within transcribed regions. Though technical artefacts (i.e. errors of PCR amplification) may still have occurred, these results suggest the presence of genuine homozygotes for null alleles. Thus, null alleles would indeed contribute, at some level, to the deficit of heterozygotes. In echinoderms, null alleles are a common feature of microsatellite markers and have already been reported in other sea urchin species (Addison and Hart 2002; McCartney et al. 2004). The presence of null alleles in our dataset probably hindered the detection of some pairs of related individuals (Dakin and Avise 2004). Therefore, the number of related individuals, as well as the levels of relatedness calculated within and among groups, are probably underestimates.
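The way a null allele alone can generate an apparent heterozygote deficit can be illustrated with a deterministic sketch. It assumes a hypothetical locus with two visible alleles (A and B) and one null allele, a population at Hardy-Weinberg equilibrium, heterozygotes carrying the null allele scored as homozygotes for the visible allele, and null homozygotes failing to amplify. The allele frequencies are invented.

```python
def apparent_fis(p_a, p_b, p_null):
    """Apparent FIS at a locus with visible alleles A, B and a null allele.

    True genotype frequencies follow Hardy-Weinberg proportions; only the
    scoring is biased (assumptions of this sketch, see lead-in).
    """
    # Scored genotype frequencies before excluding failed amplifications.
    aa = p_a ** 2 + 2 * p_a * p_null   # true AA plus A/null scored as AA
    bb = p_b ** 2 + 2 * p_b * p_null   # true BB plus B/null scored as BB
    ab = 2 * p_a * p_b                 # visible heterozygotes
    scored = aa + bb + ab              # equals 1 - p_null ** 2 (null/null is missing)

    ho = ab / scored
    freq_a = (aa + ab / 2) / scored    # apparent allele frequencies
    freq_b = (bb + ab / 2) / scored
    he = 1 - freq_a ** 2 - freq_b ** 2
    return 1 - ho / he

for p_null in (0.0, 0.05, 0.1, 0.2):
    p_visible = (1 - p_null) / 2
    print(f"null allele frequency {p_null:.2f}: apparent FIS = "
          f"{apparent_fis(p_visible, p_visible, p_null):.3f}")
```

In this toy setting the apparent FIS rises with the null allele frequency even though mating is random, which is consistent with null alleles contributing to the observed deficit without excluding other causes.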

Moreover, we did not detect any sign of positive selection in the markers used, especially on ms40, which is located within the first exon of the gene coding for the bindin protein (Zigler 2004; Calderón et al. 2009c). However, positive selection has been shown to act on amino acid sites located in two regions flanking the conserved core of the protein (Calderón et al. 2009c; Calderón and Turon 2010a), whilst Zigler and Lessios (2004) found a correlation between extensive variation in the number of repeats in the 5’ of the core region (i.e. ms40) and positive selection in the nonrepeat region in the genus Paracentrotus. It should be noted that among the markers located within coding regions, ms40 was indeed found to be highly polymorphic. Thus, although the FIS values obtained with microsatellite loci located within transcribed regions do not support the non-random mating hypothesis, this should nevertheless be considered with caution as, apart from ms40, the markers displayed a small number of over-represented alleles. Overall, these results appear consistent with the non-random mating suggested by Palumbi (1999) and Calderón et al. (2009a, 2009b), which would likely contribute to the observed deficit of heterozygotes.

Also, though the Wahlund effect is an unlikely explanation in highly dispersive species, inbreeding could, by contrast, contribute significantly to the heterozygote deficiencies. Indeed, as the most genetically related recruits were found within locations, and as the pattern of connectivity seemed recurrent among years, we argue that related individuals might reproduce in potentially significant proportions, increasing the frequency of homozygotes within the population, as previously suggested by Veliz et al. (2006).

Thus, our results on genetic structuring and relatedness appear consistent with the hypothesis of non-random mating and variance in contributing females, leading to significant inbreeding. Depending on the stochasticity of hydrodynamic features during a spawning season, this might indeed lead to a significant number of related individuals within a location, as discussed above. However, the proportion of sib-pairs within a location cannot be estimated, because the census size of the population of recruits within a location is difficult to assess or predict in such a species (Ovenden et al. 2016). Exhaustive sampling of the recruits would be time-consuming, costly and ultimately unfeasible. In future research, it would be interesting to experimentally assess the relative contributions of each of these factors to inbreeding, for instance by investigating the diversity that each reproductive aggregate would produce within source populations. Also, the hypothesis of non-random mating should be confirmed in P. lividus by testing gamete incompatibilities, as well as potential assortative mating, as suggested in Paracentrotus gaimardi (Calderón et al. 2010b). Notwithstanding the influence of complex nearshore environments and the high pre-reproductive mortality in free spawning species such as P. lividus, this knowledge would likely contribute to a better understanding of the processes that drive the dynamics and the genetic structuring of populations at fine spatial scales.

To conclude, from a conservation point of view, we showed that populations were regularly replenished by larvae produced by sufficient numbers of progenitors to ensure steady-state allelic diversity, and that a fraction of the new recruits within each population were probably produced locally. This regional-scale genetic study completes a 5-year demographic surveillance of P. lividus populations flanking Cape Sicié, which did not demonstrate any significant demographic decrease in P. lividus, even at the end of the legal harvesting season (personal data of Couvray, PhD thesis). Thus, although the P. lividus population is experiencing intense harvesting, recurrent disease and environmental stress, our results suggest that it is healthy and does not require management strategies to bolster low population densities, for instance via the replenishment of depleted populations with hatchery-produced individuals, as previously discussed (Couvray et al. 2015; Segovia-Viadero et al. 2016). Instead, we believe that the natural rescue effect of depleted localities, ensured by the existing continuum of source populations along the coast, should be supported, if necessary, by the establishment of temporary no-take zones.