Unlike most other ecosystems, deep-sea hydrothermal vent communities rely on chemosynthetic primary production, in which chemoautotrophic microbes perform redox reactions to produce energy for carbon fixation. Many vent animals host sulfide- or methane-oxidizing bacteria as endo- or ectosymbionts (Dubilier et al. 2008), leading to biomass values that are among the highest on the planet (Ramirez-Llodra et al. 2010). Their dependence on chemosynthesis restricts vent species to tectonically or volcanically active seafloor regions. The patchy distribution and ephemeral nature of hydrothermal habitats increase the vulnerability of vent species to environmental disturbances and make their dispersal among active vent fields a nontrivial challenge. Financial and technical limitations often impede thorough investigations of these remote systems, so that details about connectivity and biodiversity remain poorly understood. Despite these knowledge gaps, plans for mining of seafloor massive sulfide (SMS) deposits present an acute and realistic risk to these unique biological communities (Van Dover 2011; Boschen et al. 2013). More seriously, the scarce information we have about population genetics and ecology of vent-endemic taxa comes primarily from studies in the eastern Pacific and mid-Atlantic Ocean ridge systems (Vrijenhoek 2010). However, it is the less studied Indo-West-Pacific back-arc and ridge spreading centers that form current targets for SMS mining (SPC 2013). Licenses have been granted by the Fiji, Vanuatu and Tonga governments to explore SMS deposits in their territorial waters, and Nautilus Minerals Inc. was granted the first mining lease for vent fields in the territorial waters of Papua New Guinea. The accelerating pace of licensing and exploration emphasizes the need for more knowledge about the genetic population structure, contemporary gene flow patterns and historical migration routes of Indo-Pacific species. Such data will aid estimates of the recovery potential of hydrothermal ecosystems subject to human impact.

Bathymodiolin mussels, hosting sulfur-, methane-, and/or hydrogen-oxidizing bacterial symbionts (Petersen et al. 2011), are among the dominant taxa in deep-sea reducing environments worldwide. In the western (W.) Pacific, they frequently form extensive beds (Fig. 1) and dominate both biomass and community dynamics (e.g., Sen et al. 2014). More than 50 species (named and unnamed operational taxonomic units) are presently recognized (Lorion et al. 2013), of which those currently placed in the genus Bathymodiolus are the most diverse at hydrothermal vents.

Fig. 1 A bed of Bathymodiolus mussels at NW Eifuku Seamount. Photo from ROPOS (Canadian Scientific Submersible Facility); courtesy NOAA Ocean Exploration

Three morphospecies with similar features were described from the W. Pacific Ocean: B. septemdierum Hashimoto and Okutani (August 1994); B. brevior Von Cosel et al. (October 1994, p. 375); and B. elongatus Von Cosel et al. (October 1994, p. 379). A fourth, B. marisindicus Hashimoto (2001), was described from the Central Indian Ridge (CIR), although evidence from a single mitochondrial gene (ND4 = NADH dehydrogenase subunit 4) indicated 99.2 % sequence identity with W. Pacific B. brevior (Van Dover et al. 2001). Subsequent studies of additional mitochondrial and nuclear genes found little evidence for differentiation among the Indian Ocean and various W. Pacific populations and raised doubts about their status as distinct species (Miyazaki et al. 2004; Jones et al. 2006; Won et al. 2008). Kyuno et al. (2009) conducted phylogeographic analyses of B. brevior samples from the North Fiji Basin (NFB), B. septemdierum from the Myojin Knoll (southern Izu-Bonin Arc), and B. marisindicus from the Kairei (KA) field in the Indian Ocean. Based on their analysis of mitochondrial ND4, Kyuno et al. proposed that: (1) essentially unimpeded gene flow exists between W. Pacific populations of B. brevior and B. septemdierum separated by about 5000 km; (2) B. marisindicus is “not isolated” from the W. Pacific populations, though gene flow is “relatively limited”; and (3) the ancestor of the three species might have migrated from the Southern CIR to the Izu-Bonin Arc via the Southwest Pacific.

Based on the existing molecular evidence, Thubaut et al. (2013) recognize B. septemdierum as the prior synonym for B. brevior and B. marisindicus. Although they keep B. elongatus as a separate species, they concede that this morphotype might be conspecific as well, given the low evolutionary distances to B. septemdierum. Von Cosel and coworkers (pers. comm., 30 October 2014) are using morphological criteria and all the available molecular data to revise the taxonomy of Bathymodiolus mussels.

The present study is intended to provide more robust biogeographic and genetic foundations for revision of the B. septemdierum complex. We examined samples from 10 vent fields distributed among five Indo-Pacific locations including ridge and back-arc spreading centers, plus volcanic arcs (Table 1; Fig. 2). Multilocus genotypes of individual mussels were based on 19 gene regions: 11 allozyme loci, four nuclear DNAs, and four mitochondrial DNAs. Genotypic assignment methods recognized two regional contemporaneously isolated metapopulations inhabiting the CIR and W. Pacific back-arc spreading centers.

Table 1 Bathymodiolus septemdierum complex sampling localities
Fig. 2 Geographic distribution of the B. septemdierum complex in the Indo-Pacific. Sampled sites are: KA Kairei, ED Edmond, EF NW Eifuku, MT Mariana Trough, NF Nifonea, WL White Lady, KM Kilo Moana, TC Tow Cam, TM Tui Malila, HH Hine Hina. Unsampled sites with known occurrences according to ChEssBase/GBIF, Desbruyères et al. (2006), Miyazaki et al. (2010) and Tunnicliffe et al. (2009) are: a Mokuyo Seamount and Suiyo Seamount (Izu-Bonin Arc, morphotype septemdierum), b Myojin Knoll and Sumisu Caldera (Izu-Bonin Arc, morphotype septemdierum), c Mussel Valley (North Fiji Basin, morphotypes brevior and elongatus), d Monowai (Tonga Arc, morphotype brevior), e Hatoma Knoll (Okinawa Trough, morphotype septemdierum, only 1 specimen)

Materials and methods


Mussel specimens were obtained from 10 vent habitats in the W. Pacific and Indian oceans that were visited during multiple expeditions between 1992 and 2007 (Table 1; Fig. 2). Depths ranged from about 1500 to 3600 m. Upon collection with remotely operated vehicles, samples were either preserved in 70 % ethanol or dissected and stored at −80 °C.

DNA isolation, PCR and sequencing

DNA extraction, PCR amplification, amplicon purification and sequencing methods mirrored those of Génio et al. (2008) and Johnson et al. (2013). Mussels were genotyped at four nuclear (Cat = catchin, Col-1 = collagen type XIV, EF1α = elongation factor 1α, H3 = histone 3) and four mitochondrial (tRNA Met = transfer RNA for methionine, tRNA Val = transfer RNA for valine, ND4 = NADH dehydrogenase subunit 4, COI = cytochrome-c-oxidase subunit I) loci that were found to contain polymorphic sites in B. septemdierum (Table 2). These loci were chosen from a test suite of 10 genes and 17 primer pairs, as they were the only ones that consistently provided usable sequences. All obtained sequences were deposited in GenBank under accession numbers KP879256–KP881222 (Table 2).

Table 2 Sequence characteristics for the eight gene loci analyzed in this study


Allozymes encoded by eleven gene loci (PGDH = 6-phosphogluconate dehydrogenase, G6PDH = glucose-6-phosphate dehydrogenase, AAT1 = aspartate aminotransferase 1, AAT2 = aspartate aminotransferase 2, IDH2 = isocitrate dehydrogenase 2, LAP1 = leucine aminopeptidase 1, MPI = mannose-6-phosphate isomerase, PEP-GL = peptidase with glycyl-leucine, GPI = glucose-6-phosphate isomerase, PGM = phosphoglucomutase, LDH = lactate dehydrogenase) were examined from the subset of samples that were frozen immediately following their collection (Tables 1, 3). For each specimen, an approximately 0.2 g piece of adductor muscle was homogenized in a roughly equal volume of extraction buffer (0.01 M Tris, 2.5 mM EDTA, pH 7.0), and the homogenate was centrifuged at 12,000×g for 2 min to remove tissue debris. Screening employed cellulose-acetate gel-electrophoresis (CAGE) according to the conditions, buffers and stains outlined by Hebert and Beaton (1989).

Table 3 Allozyme loci and optimal buffers used for analyses

Sequence analysis

Forward and reverse sequences for each sample and gene were quality trimmed, clipped and paired using the De Novo Assemble tool with highest sensitivity in GENEIOUS v7.1.5. If required, base calls were corrected manually. Subsequently, consensus sequences were aligned with the integrated MUSCLE program using 20 iterations. To check the accuracy of the PCR amplifications, the resulting nucleotide alignment was compared against the nr database with MEGABLAST, choosing an e-value of 1e-20 and a minimum similarity of 75 %. Nuclear alleles in heterozygous individuals were further resolved with PHASE v2.1.1 (Stephens et al. 2001; Stephens and Donnelly 2003). To ensure the reliability of the results, we chose 10,000 iterations of the MCMC chain, a burnin of 1000 and five different seeds for the random number generator. As tRNA Met, tRNA Val and ND4 were amplified as a continuous sequence, ARWEN v1.2 (Laslett and Canbäck 2008) was used to predict the exact tRNA gene boundaries and separate the three loci. Due to problems with PCR amplification or sequence quality, samples from EF had to be excluded from all analyses except for haplotype network generation and basic statistics.

Data preparation

For combined usage of DNA and allozymic data in Bayesian (STRUCTURE, BAYESCAN, BAYESASS) and general population genetic statistics (Hardy–Weinberg (HWE) and linkage (LE) equilibrium), sequence information for each locus was reduced to a number code by identifying and indexing the unique alleles. Data were subsequently transformed into appropriate input formats using CONVERT v1.31 (Glaubitz 2004), FORMATOMATIC v0.8.1 (Manoukis 2007) or PGDSPIDER v2.0.7.2 (Lischer and Excoffier 2012). In all other analyses, truncated sequence data were utilized.
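The recoding step described above (reducing each locus to a number code by indexing its unique alleles) can be illustrated with a minimal sketch. The function name `index_alleles`, the example sequences and the missing-data code of -9 are assumptions chosen for illustration; they are not taken from the authors' workflow, which relied on CONVERT, FORMATOMATIC and PGDSPIDER for the final file formats.

```python
from typing import Dict, List, Optional, Tuple


def index_alleles(
    sequences: List[Optional[str]], missing_code: int = -9
) -> Tuple[List[int], Dict[str, int]]:
    """Map each distinct sequence at one locus to an integer allele code.

    Codes are assigned in order of first appearance (1, 2, 3, ...).
    Missing genotypes (None) receive `missing_code`, a hypothetical
    placeholder value.
    """
    codes: Dict[str, int] = {}
    recoded: List[int] = []
    for seq in sequences:
        if seq is None:
            recoded.append(missing_code)
            continue
        key = seq.upper()  # treat upper- and lower-case bases as identical
        if key not in codes:
            codes[key] = len(codes) + 1
        recoded.append(codes[key])
    return recoded, codes


# Hypothetical phased alleles at a single locus
example = ["ACGT", "ACGA", "ACGT", None, "acga"]
recoded, lookup = index_alleles(example)
print(recoded)
```

Keeping the lookup table alongside the recoded genotypes makes it possible to trace any integer code back to its original sequence.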

Molecular evolution and tests for selection

As a preparation for further analyses, we assessed the nucleotide substitution scheme for the eight nuclear and mitochondrial genes in JMODELTEST v2.1.6 (Darriba et al. 2012; Guindon and Gascuel 2003) by evaluating 5 possible models with the Bayesian Information Criterion (BIC). Unlike commonly used likelihood ratio tests, the BIC has the advantage of comparing both nested and non-nested models, determining model uncertainty and performing model averaging (Posada and Buckley 2004). For each partition we considered rate variation among sites (−g 4), a proportion of invariable sites (−i) and unequal base frequencies (−f). Nucleotide (π) and haplotype/gene (H) diversities were assessed with ARLEQUIN v3.5.1.2 (Excoffier and Lischer 2010), using the Tamura-Nei distance and a gamma correction according to the inferred substitution models. Sequence divergence was determined in MEGA6 (Tamura et al. 2013). We used a bootstrapped Maximum Composite Likelihood method (500 iterations), including both transitions and transversions and considering heterogeneous patterns among lineages. Substitution rates among sites were assumed to be uniform or modeled according to a gamma distribution, if suggested by JMODELTEST. Missing information was excluded following the pairwise deletion procedure. To validate the selective neutrality of our markers for usage in BAYESASS, we performed two types of selection tests. We used MEGA6 to conduct codon-based Z-tests for selection on protein-coding DNA sequences. However, as the amplified exonic fragments were short for Cat, Col-1 and EF1α, these markers were excluded from this analysis. We used the original or modified Nei-Gojobori method with Jukes-Cantor correction and conducted 500 bootstrap replications for each test. Gaps and missing data were treated as described. Complementarily, we used BAYESCAN v2.1 (Foll and Gaggiotti 2008) to conduct F ST -outlier tests for (pseudo-)selection on the allozyme and DNA markers. Default parameters were used for run length and prior odds (i.e., a neutral evolution model was considered 10 times more likely than an adaptive one). To avoid detection of false positives, we excluded monomorphic loci from the analyses. A locus was considered an outlier if it had a q-value <0.1.
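To make the model-selection criterion concrete, the sketch below computes BIC = −2 ln L + k ln(n) for a few candidate substitution models and converts the scores into BIC weights, which is one way to express model uncertainty. All log-likelihoods, parameter counts and the alignment length are hypothetical placeholders; parameter counts here ignore branch lengths for simplicity, and the sketch does not reproduce JMODELTEST itself.

```python
import math
from typing import Dict, Tuple


def bic(log_likelihood: float, n_params: int, n_sites: int) -> float:
    """Bayesian Information Criterion: -2 ln L + k ln(n)."""
    return -2.0 * log_likelihood + n_params * math.log(n_sites)


def bic_weights(scores: Dict[str, float]) -> Dict[str, float]:
    """Relative support for each model, derived from BIC differences."""
    best = min(scores.values())
    raw = {m: math.exp(-0.5 * (s - best)) for m, s in scores.items()}
    total = sum(raw.values())
    return {m: r / total for m, r in raw.items()}


# Hypothetical candidates: model name -> (ln L, free substitution parameters)
candidates: Dict[str, Tuple[float, int]] = {
    "JC": (-1250.0, 0),
    "HKY+G": (-1230.0, 5),
    "GTR+I+G": (-1228.0, 10),
}
N_SITES = 500  # hypothetical alignment length

scores = {m: bic(lnl, k, N_SITES) for m, (lnl, k) in candidates.items()}
weights = bic_weights(scores)
best_model = min(scores, key=scores.get)
print(best_model)
```

In this toy example the most parameter-rich model has the highest likelihood but is penalized for its extra parameters, so the intermediate model receives the lowest BIC.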

Genetic structure

To identify the degree of partitioning between populations we applied the program STRUCTURE v2.3.4 (Pritchard et al. 2000). We chose the admixture model with correlated allele frequencies and a sample site prior (Falush et al. 2003; Hubisz et al. 2009) to infer the most likely number of genetic clusters K. The MCMC chain was run for 10^6 iterations with a burnin period of 10^5. Posterior probabilities were calculated for K ranging from 1 to 11 with five replications for each value and corrected after the ΔK method by Evanno et al. (2005). Tests of Hardy–Weinberg and linkage equilibrium (α = 0.05) within populations were done in ARLEQUIN and adjusted for type 1 error, using the Benjamini–Yekutieli False Discovery Rate (BY FDR) procedure (Benjamini and Yekutieli 2001). Compared to sequential Bonferroni correction, this method has the advantage of not losing test power while controlling for error rates in multiple comparisons (Narum 2006). Only two of the allozyme loci (Mpi and Gpi) were sufficiently polymorphic to warrant inclusion in the STRUCTURE analyses. The four mitochondrial markers (hereafter mtDNA) were concatenated into a single data string for each individual. We examined the consistency of the results by running analyses in four ways: (1) all polymorphic markers (Cat, Col-1, EF1α, H3, Mpi, Gpi, plus mtDNA); (2) the neutral nuclear markers alone (Cat, EF1α, Mpi, Gpi); (3) the DNA sequence markers alone (Cat, Col-1, EF1α, H3, plus mtDNA); and (4) the neutral nDNA markers alone (Cat, EF1α). STRUCTURE results were converted to vector graphics using the RUBY script “bar_plotter.rb” to improve image quality.
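As an illustration of the ΔK correction mentioned above, the following sketch computes ΔK from replicate ln P(D) values as the absolute second-order difference of the mean log-likelihoods divided by the standard deviation at K, a form commonly used when summarizing STRUCTURE runs. The replicate values are hypothetical and the function name `delta_k` is an assumption; this is not the authors' script.

```python
from statistics import mean, stdev
from typing import Dict, List


def delta_k(lnp: Dict[int, List[float]]) -> Dict[int, float]:
    """Delta K for every K that has both neighbors K-1 and K+1.

    Delta K = |mean L(K+1) - 2 * mean L(K) + mean L(K-1)| / sd L(K)
    """
    result: Dict[int, float] = {}
    for k in sorted(lnp):
        if (k - 1) not in lnp or (k + 1) not in lnp:
            continue  # undefined at the smallest and largest K tested
        sd = stdev(lnp[k])
        if sd == 0:
            continue  # avoid division by zero for identical replicates
        second_diff = abs(
            mean(lnp[k + 1]) - 2.0 * mean(lnp[k]) + mean(lnp[k - 1])
        )
        result[k] = second_diff / sd
    return result


# Hypothetical ln P(D) values, five replicates per K
runs = {
    1: [-5200.0, -5201.0, -5199.0, -5200.0, -5200.0],
    2: [-4800.0, -4802.0, -4799.0, -4801.0, -4798.0],
    3: [-4790.0, -4810.0, -4780.0, -4805.0, -4795.0],
    4: [-4800.0, -4830.0, -4790.0, -4820.0, -4810.0],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print(best_k)
```

Because ΔK depends on the neighboring values of K, it cannot be evaluated for K = 1, so the statistic alone cannot rule out a single panmictic cluster.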

As a measure of population differentiation, pairwise F ST (allozymes) and Φ ST (mtDNA, nDNA) values were estimated in ARLEQUIN v3.5.1.2, using non-parametric permutation procedures with 10,000 replications. Haplotype distance matrices for Φ ST estimations were calculated with the Tamura-Nei model. All tests were corrected following the BY FDR method, choosing 0.05 as α level.

Hierarchical structuring of population differentiation was tested with the Analysis of Molecular Variance (AMOVA) framework in ARLEQUIN v3.5.1.2, grouping demes according to regional affinities (W. Pacific versus Indian Ocean). Owing to missing genotypes in our datasets, covariance components and Φ-statistics were assessed for each locus separately and combined into a global weighted average. Statistical significance of the Φ estimates was evaluated by 10,000 permutations.

Contemporary migration

Potential recent migration events based on non-linked, neutral polymorphic nDNA and allozyme markers (Cat, EF1α, Mpi, Gpi) were analyzed with BAYESASS v3.0, a Bayesian assignment software that can accurately detect immigrants over the last two generations if population pairwise F ST s are greater than 0.05 and loci are in LE (Wilson and Rannala 2003; Faubet et al. 2007). Since genetic differentiation between individual populations was low and initial test runs with unpooled data led to inconsistent results and low ESS values, we merged samples belonging to the same arc or mid-ocean ridge area (Table S1). In addition, we performed separate calculations for the LB samples to estimate connectivity within the W. Pacific back-arc spreading centers (Table S2). Runs were iterated 10^8 times, using a sampling interval of 2000 and discarding the first 10^7 iterations. Consistency of results was evaluated by repeating calculations 10 times with different random number seeds, while convergence was checked with TRACER v1.6. To achieve the recommended acceptance rates (20–60 %) for changes in mixing parameters, we set the step length for migration rates and inbreeding coefficients to 1.0 and the step length for allele frequencies to 0.6 (LB data set) or 1.0 (pooled data set). To obtain the most reliable result, we calculated the Bayesian deviance for each run based on the R script by Meirmans (2014) and selected the output with the lowest value. The significance of the migration rate estimates for all pairs of sites was evaluated by assessing overlaps of the 95 % confidence intervals with those simulated for uninformative data.
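A short sketch of the bookkeeping behind these run settings may be helpful. Taking the run length as 10^8 iterations and the burn-in as 10^7, and assuming that samples are recorded only after the burn-in, the number of retained samples follows directly from the sampling interval. The second part mimics the selection of the replicate with the lowest Bayesian deviance; the seeds and deviance values are hypothetical placeholders, not results from the study.

```python
from typing import Dict


def retained_samples(iterations: int, burn_in: int, interval: int) -> int:
    """Number of MCMC samples kept after discarding the burn-in."""
    if burn_in >= iterations:
        raise ValueError("burn-in must be shorter than the run")
    return (iterations - burn_in) // interval


n_kept = retained_samples(iterations=10**8, burn_in=10**7, interval=2000)
print(n_kept)  # 45000


def select_best_run(deviance_by_seed: Dict[int, float]) -> int:
    """Return the seed of the replicate with the lowest Bayesian deviance."""
    return min(deviance_by_seed, key=deviance_by_seed.get)


# Hypothetical deviances for three replicate runs (seed -> deviance)
deviances = {10: 4886.2, 20: 4884.9, 30: 4887.5}
best_seed = select_best_run(deviances)
print(best_seed)
```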

Phylogenetic relationships

We used the median-joining (MJ) method in NETWORK v4.6.1.2 (Bandelt et al. 1999; Forster et al. 1996; Saillard et al. 2000) to reconstruct phylogenies for mtDNA and nDNA sequence data. Weights of characters and mutation types were left at their defaults. All calculations were repeated three times with the epsilon parameter fixed at values of 0, 10 and 20 to validate the consistency of the results. Finally, redundant median vectors were removed from all networks using the MP option (Polzin and Daneshmand 2003). In cases where large gaps were present in the alignments, network calculations produced unreasonable results in terms of mutation steps or homoplasies. We therefore either excluded problematic sequences (2 Col-1 sequences) or marked indels in the network graphs (EF1α, Cat). As we did not observe any apparent sub-structuring at the population level, we pooled samples according to arc and mid-ocean ridge area to simplify the visual output.

Results


DNA sequence variation and selection regimes

DNA fragments of eight genes were examined in mussel samples from 10 locations (Table 1). Haplotype and nucleotide diversities for mitochondrial COI and ND4 were among the highest observed in this study (Table 2). Although the tRNA Val haplotypes were less variable, nucleotide polymorphism was larger than in all other mtDNA genes. Overall, tRNA Met exhibited the lowest haplotype and nucleotide variation. Even so, the level of nucleotide polymorphism was similar to that of some nuclear genes (EF1α, H3). Phylogenetic networks generated for COI and ND4 were topologically similar (Fig. 3). The most frequent haplotypes in the W. Pacific samples were entirely absent from the Indian Ocean samples. The W. Pacific and Indian Ocean regions were almost completely differentiated for tRNA Val haplotypes, but the most common tRNA Met haplotype dominated both regions.

Fig. 3 Phylogenetic networks for (a) mitochondrial and (b) nuclear DNA. Pie sizes for each network are proportional to the frequency of the respective gene variant (n = 1 for the smallest circle). White circles represent unknown/missing haplotypes. Numbers on branches show the number of mutations between alleles. If no numbers are indicated, only one mutation step occurred. Branch lengths are not to scale to improve visualization. *indel

In general, the protein-coding nuclear genes were less polymorphic than the protein-coding mitochondrial genes (Table 2; Fig. 3). Only Cat showed a haplotype diversity (H) that was comparable to that of COI and ND4, and its nucleotide diversity (π) was the greatest of all genes. H3 and EF1α had the lowest H and π diversities, respectively. The dominant H3, EF1α and Col-1 haplotypes were shared between the W. Pacific and Indian Ocean regions (Fig. 3). In contrast, the Cat haplotypes in the Indian Ocean samples were mostly unique to that region. A few rare and potentially private polymorphisms were observed for all genes. Except for Cat, the overall and net between-region distances were typically smaller for the nuclear than for the mitochondrial genes (Table S3). Overall and net between-region divergences for Cat were in the upper range of the mitochondrial values. Sequence divergence within the W. Pacific was usually close to zero and did not exceed 0.00018 for any gene (Table S4).
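For readers less familiar with the two diversity measures compared above, the sketch below computes haplotype diversity as H = n/(n − 1) · (1 − Σ p_i²) and nucleotide diversity π as the mean number of pairwise differences per site. Note the simplification: it uses uncorrected pairwise differences rather than the Tamura-Nei distance with gamma correction applied in ARLEQUIN, and the aligned sequences are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import List


def haplotype_diversity(haplotypes: List[str]) -> float:
    """Unbiased haplotype (gene) diversity: n/(n-1) * (1 - sum p_i^2)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two sequences are required")
    freqs = [count / n for count in Counter(haplotypes).values()]
    return n / (n - 1) * (1.0 - sum(p * p for p in freqs))


def nucleotide_diversity(seqs: List[str]) -> float:
    """Mean pairwise differences per site (uncorrected p-distance)."""
    n = len(seqs)
    length = len(seqs[0])
    if n < 2 or any(len(s) != length for s in seqs):
        raise ValueError("need >= 2 aligned sequences of equal length")
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(seqs, 2)
    )
    n_pairs = n * (n - 1) / 2
    return total_diffs / n_pairs / length


# Hypothetical aligned sequences (10 sites, 4 individuals)
alignment = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAT"]
h = haplotype_diversity(alignment)
pi = nucleotide_diversity(alignment)
print(round(h, 4), round(pi, 4))
```

The two measures can diverge: a locus with many haplotypes that differ by single substitutions has high H but low π, whereas a few deeply divergent haplotypes produce the opposite pattern.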

Results of the tests for selection differed among loci. Codon-based Z-tests suggested purifying selection for all investigated sequences (i.e., COI, ND4, and H3; Table S3). F ST -outlier analyses were conducted with all the nuclear genes (Table S5; Fig. S1). Only Col-1 deviated from neutral expectations, possibly indicating balancing or negative selection.

Allozyme variation

Allozyme diversity was scored in the eight frozen samples (Table 1). The number of alleles was lower than for the DNA markers, ranging from 1 to 5. Because only non-synonymous charge-based polymorphisms are revealed with CAGE methods, actual variation might be substantially greater. Two loci were polymorphic in all eight samples (Mpi and Gpi), three had rare alleles (q < 0.05) that occurred in two or more samples (G6pdh, Lap-1, and Pep-gl), four had alleles that were “private” to a single sample (Pgdh, Aat-1, Aat-2, and Idh-2), and two were monomorphic in all samples (Pgm and Ldh). Despite some minor differences in allele frequencies, all Indian Ocean and W. Pacific samples were usually dominated by the same most common variant, regardless of the locus analyzed (Table S6). F ST -outlier analyses were non-significant for all loci tested.

Hardy–Weinberg and linkage equilibrium

Tests for HWE indicated 20 significant departures from random mating expectations out of 62 possible comparisons. After BY FDR correction (α = 0.01061) the number of significant tests decreased to 10, but deviations were not uniform across populations or loci (Table S7). In the case of LE a total of 170 performable tests resulted in 26 significant patterns of genetic linkage, which were reduced to 13 significant associations following BY FDR correction (α = 0.00875). However, the pairing of genes was dissimilar among populations (Table S8). This lack of consistency suggests that our genetic data were free of statistical biases and that the observed departures from HWE and LE most likely resulted from scoring problems, allelic dropouts in heterozygous individuals (Piyamongkol et al. 2003) or limitations of the likelihood ratio test for samples with unknown gametic phase (Excoffier and Slatkin 1998).
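The corrected significance thresholds quoted above appear consistent with a single-threshold form of the BY FDR correction, in which the nominal α is divided by the harmonic sum over the number of tests, α / Σ(1/i) for i = 1..k. The sketch below is an illustration under that assumption rather than the authors' code; with α = 0.05 it reproduces the reported values for 62 and 170 tests.

```python
def by_fdr_threshold(alpha: float, n_tests: int) -> float:
    """Critical value alpha / sum_{i=1..k} (1/i) for k tests."""
    if n_tests < 1:
        raise ValueError("n_tests must be a positive integer")
    harmonic_sum = sum(1.0 / i for i in range(1, n_tests + 1))
    return alpha / harmonic_sum


hwe_threshold = by_fdr_threshold(0.05, 62)   # Hardy-Weinberg tests
le_threshold = by_fdr_threshold(0.05, 170)   # linkage equilibrium tests
print(f"{hwe_threshold:.5f} {le_threshold:.5f}")  # 0.01061 0.00875
```

Because the harmonic sum grows only logarithmically with the number of tests, this threshold is far less severe than a plain Bonferroni division by k (0.05 / 62 ≈ 0.0008).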

Population structure

Regardless of the marker types, all STRUCTURE simulations produced qualitatively similar population clusters that distinguished samples from the Indian Ocean versus W. Pacific regions (Fig. 4, two examples). After BY FDR correction Φ ST indices based on the mitochondrial loci were significant for all pairwise comparisons of the Indian and W. Pacific samples (Table 4). In contrast, within-region comparisons were not significant except for contrasts involving TM and KM (Φ ST  = 0.0824), MT (Φ ST  = 0.2150) and NF (Φ ST  = 0.1077). Populations were less divergent for the nuclear genes, although all contrasts between Indian Ocean and W. Pacific samples were significant except for the KA-HH and ED-HH pairs (Table 4). All within-region contrasts were not significant. Pairwise F ST s based on the allozyme loci Mpi and Gpi were not significant with the exception of the ED-HH (F ST  = 0.1548), KM-HH (F ST  = 0.1262) and TC-HH (F ST  = 0.0724) comparisons (Table 5).

Fig. 4 STRUCTURE analysis. Bar plots showing the clustering of individuals based on K = 2 and (a) all polymorphic markers (Cat, Col-1, EF1α, H3, Mpi, Gpi, concatenated mtDNA) and (b) only neutral polymorphic markers (Cat, EF1α, Mpi, Gpi). Each vertical line represents one mussel sampled at the respective location, where numbers on the left indicate the genetic content an individual inherits from each cluster. KA Kairei, ED Edmond, KM Kilo Moana, TC Tow Cam, TM Tui Malila, HH Hine Hina, WL White Lady, MT Mariana Trough, NF Nifonea

Table 4 Pairwise Φ ST s for mtDNA (above diagonal) and nDNA (below diagonal)
Table 5 Pairwise F ST s for the two most polymorphic allozyme loci Mpi and Gpi

AMOVAs were conducted separately for mtDNA, nDNA and allozymes (Table 6). Most of the mtDNA variance (66.84 %) occurred between regions, a small but statistically significant amount (0.49 %) occurred among samples within regions, and 32.68 % resided within samples. Less of the nDNA variance (22.40 %) occurred between regions, an insignificant amount (0.32 %) occurred among samples within regions, and most (77.28 %) resided within samples. In contrast, nearly all of the allozyme variation (97.35 %) resided within samples, no significant variation (−0.28 %) occurred between regions, and little existed among samples within regions (2.93 %).

Table 6 AMOVA results for the hierarchical grouping of 9 sampled sites into ocean region
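The relationship between the variance percentages reported above and the hierarchical Φ-statistics can be sketched with the standard AMOVA definitions: Φ CT = σ²a / σ²T, Φ SC = σ²b / (σ²b + σ²c) and Φ ST = (σ²a + σ²b) / σ²T. The example plugs in the mtDNA percentages from the text as stand-ins for the variance components (the ratios are unaffected by this scaling); the function name is an assumption for illustration.

```python
from typing import Dict


def phi_statistics(
    among_regions: float, among_samples: float, within_samples: float
) -> Dict[str, float]:
    """Hierarchical fixation indices from three AMOVA variance components."""
    total = among_regions + among_samples + within_samples
    lower = among_samples + within_samples
    if total == 0 or lower == 0:
        raise ValueError("variance components sum to zero")
    return {
        "phi_ct": among_regions / total,                    # between regions
        "phi_sc": among_samples / lower,                    # within regions
        "phi_st": (among_regions + among_samples) / total,  # overall
    }


# mtDNA percentages of variation as reported in the text
mtdna = phi_statistics(66.84, 0.49, 32.68)
print({name: round(value, 4) for name, value in mtdna.items()})
```

The between-region index recovered this way (about 0.668) corresponds to the 66.84 % of mtDNA variance attributed to differences between the Indian Ocean and W. Pacific regions.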


BAYESASS outputs implied a strong self-seeding rate (≥0.6828; fraction of individuals migrating from source vent per generation) (Table S9; Table S10). On a broader scale, limited immigration was only seen from LB to NFB (0.2330) and to VA (0.2709). By contrast, no significant immigration was detected for the LB localities. As populations were poorly differentiated within the W. Pacific, we cannot exclude that some migration events have been overlooked or that a fraction of immigrants were wrongly assigned due to limitations of the BAYESASS algorithm (Faubet et al. 2007; Meirmans 2014). Additionally, the sample size for LB was higher than for the other demes, so that detection of contemporary emigration events from this site might have been facilitated. Nevertheless, all 10 of the BAYESASS replicates produced the same outcomes and varied only slightly in the Bayesian deviance (pooled data set: 4884.783–4888.187; LB data set: 1927.533–1928.588). We experienced no problems with convergence and we achieved high effective sample sizes (ESS > 200) for all parameters. This indicates that most of the estimates should be reasonably accurate.

Discussion


Analyses of genetic variation within the Bathymodiolus septemdierum complex confirmed earlier findings of low differentiation among the morphospecies and populations (Kyuno et al. 2009). Nonetheless, assignment procedures partitioned the multilocus genotypes into two distinct regional metapopulations: (1) all sampled W. Pacific localities including the nominal species B. septemdierum, B. brevior and B. elongatus; versus (2) the Indian Ocean samples comprising the nominal species B. marisindicus. Despite overall low levels of sequence divergence, Cat alleles and mitochondrial haplotypes for ND4, COI and tRNA Val were almost completely distinct between the two regions. On the other hand, gene variants at the other nDNA and allozymic loci as well as tRNA Met were broadly shared between the Indian and W. Pacific samples. These discordances in mitochondrial and nuclear differentiation were reflected by the fixation indices based on AMOVA Φ-statistics (Φ CT (mtDNA) = 0.6684; Φ CT (nDNA) = 0.2240; F CT (allozymes) = –0.0028). Such differences among genetic marker systems are fairly typical, resulting in part from the inverse relationship between effective population size (2–4 times smaller for mtDNA) and rates of lineage sorting (reviewed in Toews and Brelsford 2012), from synonymous substitutions affecting DNA sequences versus the inability to see them with allozyme electrophoresis, and from discrepancies in selective constraints experienced by these markers.

For example, the H3 exon showed evidence of purifying selection according to the codon-based Z-test, but the F ST -outlier test failed to detect selection, although the populations were almost invariant. Perhaps the same few alleles provided a universal advantage, which is in concordance with the essential function of H3 in DNA packaging and the slower mutation rate (= lower allelic diversity) of conserved genes (Brown et al. 1979). In contrast, differentiation of the allozymes and EF1α did not deviate from neutral expectations. The broad Indo-Pacific distributions of the most frequent alleles might have resulted from incomplete lineage sorting, a phenomenon that is especially pronounced in recently separated populations (Pamilo and Nei 1988; Degnan and Rosenberg 2009). F ST -outlier tests suggested possible balancing selection for locus Col-1, which might point to genetic hitchhiking scenarios (Barton 2000). Balancing selection at a closely linked locus could eliminate differences between populations while maintaining Col-1 diversity. In contrast, Cat polymorphisms appeared to be selectively neutral and almost fixed between the Indian Ocean and W. Pacific samples, perhaps due to a founder event resulting from colonization of W. Pacific vents by Indian Ocean emigrants (Desbruyères et al. 2006; Kyuno et al. 2009). Variation at the two protein-coding mitochondrial markers, ND4 and COI, appeared to be shaped by negative selective forces, as expected for a rapidly evolving clonal genome that is susceptible to an accumulation of slightly deleterious mutations (i.e., Muller’s ratchet; reviewed in Stewart and Larsson 2014). Although purifying selection appears to be contradictory to the observed levels of polymorphism, mtDNA variation might be maintained due to increased mutation rates (Brown et al. 1979), co-evolution with the nuclear genome (Dowling et al. 2007) or doubly uniparental inheritance, which is known to occur in some mytilid species (Zouros et al. 1994; Skibinski et al. 1994; Hoeh et al. 1996; Zouros 2013). Interestingly, the mitochondrial tRNAs showed contrasting results concerning the Indo-Pacific divergence. While variation at both loci is likely to be influenced by close linkage to ND4, changes in the tRNA Met gene might be constrained by its essential role in translation initiation.

Mussels from vents and seeps show a variety of distribution patterns from highly restricted to the very broad range shown by B. septemdierum in this study. The known occurrences are indicated on Fig. 2: from Monowai Volcano in the southern Tonga Arc to Mokuyo Seamount in the Izu-Bonin Arc, a distance of 7500 km; from Monowai to KA the range is 9600 km. These ranges are the largest known among vent animals. Such a broad distribution has some precedent in the study of Olu et al. (2010) who recognize pan-Atlantic populations of two seep mussel species. Bathymodiolus is known to have species with long-lived larvae that may spend over a year in a teleplanic mode (Arellano and Young 2009), which, in the case of B. septemdierum, may be able to maintain an extended metapopulation in the Pacific through larval exchange. Furthermore, Arellano et al. (2014) show that these larvae perform ontogenetic vertical migrations to the ocean surface where faster currents would transport them over longer distances. While larval movement is necessary to connect populations, it is not sufficient. Habitat must be suitable and available. The wide depth range recorded for this species (1100 m to nearly 3500 m) suggests broad adaptation. But there are many vent sites in the Pacific where other species of Bathymodiolus may exclude B. septemdierum, for example, B. manusensis in the Manus Basin (MB) and B. platifrons in the Okinawa Trough.

Community similarities between the Indian Ocean and W. Pacific vents (Van Dover et al. 2001) indicate historical connectivity either north of New Guinea (Hessler and Lonsdale 1991) or from the South East Indian Ridge via the Macquarie Ridge Complex and Kermadec Arc to the W. Pacific. Alternatively, gene flow between Indian and Pacific systems is likely active via the Indonesian Throughflow and supported by venting on subsea volcanoes in the Molucca and Banda Seas (e.g., McConachy et al. 2004). While these connections would generally support the low genetic divergence found in our study, sea level drops during past glacial periods might have temporarily isolated Indian and Pacific populations (Barber et al. 2000), which could have contributed to the differences seen in mitochondrial and Cat variants. Despite possibly long planktonic larval periods, the results from our BAYESASS analysis suggested a high degree of isolation between all investigated vent regions and sites, at least over the last two generations. Given the shallow population structure in the B. septemdierum complex, these patterns imply that contemporary gene flow exists, but that it occurs indirectly and over a longer time frame. This hypothesis would agree with a recent modeling study of coral larval drift pathways (Wood et al. 2014), which shows that present connectivity in the Indo-Pacific area can be achieved in a stepping stone fashion. Consequently, it is plausible that the opportunistic timing of metamorphosis, the availability of sufficient intermediate habitats and an effective establishment at new settlement sites rather than long-distance dispersal per se are the driving forces for the broad biogeography of this species.

Studies of other Indo-Pacific vent taxa suggest that contemporary gene flow is limited in the W. Pacific back-arcs for some species and that migration routes are often asymmetric. For example, hidden genetic diversity and cryptic species were detected in gastropods of the genus Alviniconcha, where either two or three species were found to co-occur in NFB (A. kojimai, A. boucheti), MB (A. kojimai, A. boucheti) and LB (A. kojimai, A. boucheti, A. strummeri) (Johnson et al. 2014). Populations of the snail Ifremeria nautilei were undifferentiated within and between NFB and LB, but genetically distinct from MB sites based on mitochondrial and microsatellite markers (Thaler et al. 2011). As in our study, the authors observed weak asymmetric movement from LB to NFB, in accordance with prevailing ocean currents. We also identified limited gene flow from LB to VA, suggesting that LB might function as a source population for Bathymodiolus and Ifremeria metapopulations in other W. Pacific back-arc and arc volcano settings. This hypothesis may be relevant to potential impacts from mining of seafloor massive sulfides in the South Pacific. Nonetheless, contrasting outcomes of gene flow studies in other co-distributed species highlight the need for comprehensive genetic and life history investigations of vent taxa and their likely dispersal routes (reviewed in Vrijenhoek 2010). Vent metacommunity dynamics will influence the behavior of both reserves and mined areas after anthropogenic disturbances. Recommendations for sustainable mining plans would almost certainly be illusory if they were based on data from a limited number of species. Thus, even if our results suggest that sites in the W. Pacific back-arc basins have high gene flow on a longer time scale, it will be necessary to determine the settlement success of foreign larvae, to characterize source, sink and intermediate stepping stone sites, and to measure actual recovery rates of already perturbed vent habitats in order to assess the impacts of mining activities and design management units for biodiversity conservation (Boschen et al. 2013).

A drawback of our study is that, in most cases, only one locality was sampled per back-arc basin, so true levels of within-basin differentiation are uncertain. Consequently, sampling of multiple populations at a finer scale will be required to detect cryptic population structure, a phenomenon that is becoming increasingly apparent in hydrothermal systems (e.g., Kojima et al. 2001; Vrijenhoek 2009; Shank and Halanych 2007; Johnson et al. 2014; Puetz 2014). Multilocus genotyping approaches tend to be more informative than morphometric or unigenic methods, but they cannot reveal the whole picture of genetic subdivision. Next-generation technologies like RAD sequencing provide a promising and cost-effective solution to this obstacle by uncovering restriction site associated SNPs across the entire genome without the need to sequence it completely (Davey et al. 2011; Wang et al. 2012; Reitzel et al. 2013). Future studies could exploit this method to reveal genomic differentiation, unambiguously assign natal origins of recruiting larvae, and identify genes involved in speciation (e.g., Puebla et al. 2014), thereby helping to understand the molecular mechanisms that underlie biodiversity dynamics at hydrothermal vents and improving attempts to mitigate impacts of deep-sea mining in these unique ecosystems.


The present analysis identified two recently separated regional metapopulations: mussels on the Central Indian Ridge (comprising the morphotypic species B. marisindicus) and mussels in the W. Pacific back-arc spreading centers (comprising the morphotypic species B. septemdierum, B. elongatus and B. brevior). Discordant patterns of gene flow inferred from the genetic markers probably reflect the different evolutionary histories of the investigated loci. Importantly, the shared genetic variation between populations appears to result from past migration and from infrequent or indirect contemporary migration events (stepping stone connectivity), as virtually no evidence for immigration was detected over the last two generations.

Our results have significant ramifications for biodiversity conservation in the Indo-Pacific region, where deep-sea mining projects are under way and exploration is rapidly increasing. Additional studies using highly informative markers and targeting a variety of taxa with different life-history characteristics are necessary to assess the degree of present gene flow on a genome-wide, multi-species level. Detailed knowledge about repopulation rates, intermediate settlement sites and other mechanisms influencing connectivity will also be crucial for designing efficient management plans for the protection of hydrothermal vent ecosystems.