Marine Biology, Volume 157, Issue 7, pp 1417–1431

Are there true cosmopolitan sipunculan worms? A genetic variation study within Phascolosoma perlucens (Sipuncula, Phascolosomatidae)


  • G.Y. Kawauchi
    • Museum of Comparative Zoology, Department of Organismic and Evolutionary Biology, Harvard University
  • Gonzalo Giribet
    • Museum of Comparative Zoology, Department of Organismic and Evolutionary Biology, Harvard University
Original Paper

DOI: 10.1007/s00227-010-1402-z

Cite this article as:
Kawauchi, G.Y. & Giribet, G. Mar Biol (2010) 157: 1417. doi:10.1007/s00227-010-1402-z


Phascolosoma perlucens is one of the most common intertidal sipunculan species and has been considered a circumtropical cosmopolitan taxon due to the presence of a long-lived larva. To verify whether P. perlucens is a true cosmopolitan species or a complex of cryptic forms, we examined the population structure and demographics of 56 putative P. perlucens individuals from 13 localities throughout the tropics. Analysis of two mitochondrial markers, cytochrome c oxidase subunit I and 16S rRNA, suggests high levels of genetic differentiation between distantly located populations of P. perlucens. At least four different lineages identified morphologically as P. perlucens were distinguished. These lineages are likewise supported by phylogenetic analysis of the two mitochondrial markers and by the haplotype network analysis. Our results suggest that P. perlucens is a case of overconservative taxonomy, rejecting the alleged cosmopolitanism of P. perlucens. However, cryptic speciation also exists in some areas, including a possible case of geminate species across the Isthmus of Panama.


Traditionally, members of the phylum Sipuncula (commonly known as peanut worms among other names) have been recognized based on limited external morphology and internal anatomy. This has often resulted in poor species descriptions (due to the limited amount of morphological variation existing in the group) and high rates of cosmopolitanism (ca. 20% of the currently recognized sipunculan species are either cosmopolitan or distributed in several major ocean basins; Cutler 1994). This high proportion of near-cosmopolitan species is not peculiar to sipunculans, as it also occurs in other groups of marine invertebrates with limited morphological resolution. In such groups, morphologically delimited species may actually constitute evolutionarily distinct, unrelated lineages (Knowlton 1993; Klautau et al. 1999). Thorpe and Solé-Cava (1994) suggested that nominal species that are morphologically simple are associated with broader apparent geographical distributions than those with more complex morphologies. But a true cosmopolitan species must have genetic cohesiveness due to gene flow throughout its distribution (Klautau et al. 1999), whereas most reported cases of cosmopolitan distribution have turned out to be the result of an overconservative taxonomy (Klautau et al. 1999; Knowlton 2000; Bleidorn et al. 2006). Although cosmopolitanism in sipunculans could be explained by the limited morphological resolution, it could also be due to the presence, in many species, of a long-dispersal teleplanic larval form called pelagosphaera (Scheltema 1975, 1988; Scheltema and Hall 1975; Rice 1981).

The advent of molecular tools has helped to solve problematic identifications in groups with limited morphological resolution. Molecular techniques have thus been very helpful at revealing cryptic and sibling species among widely distributed taxa (e.g., Knowlton 2000; McGovern and Hellberg 2003; Waters and Roy 2004; Terranova et al. 2007). This particular systematic challenge is illustrated in numerous studies on marine invertebrate groups, such as sponges (Solé-Cava and Thorpe 1986; Boury-Esnault et al. 1999; Klautau et al. 1999), cnidarians (McFadden et al. 1997; Monteiro et al. 1997), ascidians (e.g., Aron and Solé-Cava 1991), cycliophorans (Baker et al. 2007), echinoderms (Lessios et al. 2001), polychaetes (Westheide and Schmidt 2003; Bleidorn et al. 2006), molluscs (Warnke et al. 2004), bryozoans (Schwaninger 2008), and sipunculans (Staton and Rice 1999).

Phascolosoma perlucens (Baird 1868) is one of three circumtropical widespread species in the genus Phascolosoma (Cutler 1994), commonly found in high abundance in shallow waters on coral rubble or in crevices of calcareous rocks (Rice 1975). The geographical range of P. perlucens includes the Caribbean region from Venezuela to Florida (Baird 1868; Fisher 1952; Ten Broeke 1925; Murina 1967; Stephen and Edmonds 1972; Rice and MacIntyre 1979), the western Pacific from Australia to Vietnam and central Japan (Selenka et al. 1883; Sluiter 1891; Shipley 1898; Augener 1903; Lanchester 1905; Fischer 1922; Monro 1931; Murina 1964; Edmonds 1980; Cutler et al. 1984), and the eastern Pacific from Panama to Northern Mexico (Fisher 1952). There are also several records from the Indian Ocean (Fischer 1922; Cutler 1965; Cutler and Kirsteuer 1968; Cutler and Cutler 1979) and a few from the eastern Atlantic Ocean (Stephen 1960), making the species virtually cosmopolitan. The revisionary work of Cutler (1994) reduced much of the confusion with P. perlucens but resulted in a long list of synonymies, which may reflect some level of over-lumping.

Sipunculan taxonomy is based on external architecture (e.g., morphology of the hooks, shields, shape of the body, tentacle arrangement) and internal anatomy (organization of the internal organs). Two main diagnostic attributes are used to differentiate P. perlucens from its congeners. The first is the reddish, conical, and posteriorly directed pre-anal papillae on the dorsal base of the introvert (Cutler and Cutler 1990), even though the same authors noted that the papillae vary with age, habitat, and/or postmortem fixation history. The second diagnostic character is the secondary rounded tooth on the posterior edge (the concave side) of the hook, together with differences in the internal pattern visible when a hook is examined via transmitted light. In P. perlucens, this pattern consists of a triangle at the anterior edge on the base of the hook and a clear “C” streak posterior to the triangle. However, the validity of these characters has not been rigorously tested using additional information. The only exception is the molecular analysis of Schulze et al. (2007), who included multiple individuals of P. perlucens, which did not form a clade. The combined analysis of molecular data and morphology applied by Schulze et al. (2007) pointed out the taxonomic problem within the genus Phascolosoma and showed that morphological data alone reach their limits because of the simple morphology of these worms.

To our knowledge, only one study has investigated cosmopolitanism among Sipuncula by examining the genetic evidence of dispersal in Apionsoma misakianum (Ikeda 1904) using allozyme data (Staton and Rice 1999). This study presented evidence for cryptic speciation, casting doubt on the cosmopolitan status of this species. Likewise, there is only one study examining population-level structure of sipunculans using DNA sequence data. Du et al. (2008), as a response to the exploitation of the commercial species Sipunculus nudus, studied the genetic diversity and population structure of this species along the coast of China using the mitochondrial gene 16S rRNA.

The goal of the present study was to evaluate whether Phascolosoma perlucens is a circumtropical cosmopolitan species or whether it is yet another case of cryptic speciation or an artifact of overconservative taxonomy. We investigated whether, in the case of sipunculan species, there are true cosmopolitan species or whether they constitute mere cosmopolitan names. In order to do this, we evaluated the population structure and demography of different geographic P. perlucens populations whose biology suggested high dispersal capability, but whose geographic distribution is discontinuous. We sequenced two regions of mitochondrial DNA (mtDNA) comprising fragments of the cytochrome c oxidase subunit I (COI) and 16S rRNA genes from specimens collected in six subdivisions determined according to geographical distribution.

Materials and Methods

A total of 56 specimens from recent and previous collections from 13 different geographical locations were used for this study. Because of the small number of specimens available in some localities, we grouped the localities according to larger geographical subdivisions. Six subdivisions were determined: Caribbean (CA), Florida (FL), Thailand (TH), South Africa (SA), Costa Rica (CR; Pacific side), and New Caledonia (NC) (Fig. 1; Table 1). Details on the collecting localities are given in Appendix 1.
Fig. 1

Collection localities of Phascolosoma perlucens. Localities were pooled according to large geographical subdivisions: Caribbean, composed of samples from Barbados (BA), Belize (BE) and Venezuela (VE); Florida (FL); Thailand (TH); South Africa (SA); Costa Rica (CR); and New Caledonia (NC). For more details, see Table 1

Table 1

Description of the collecting localities, samples divided according to a large geographic subdivision and number of individuals collected in each site

MCZ accession numbers | Collection site | Subdivision by geographical distribution | N
DNA100748, DNA100749 | Bank Reef, Barbados | CA |
 | Martin’s Bay, Barbados | CA |
 | River Bay, Barbados | CA |
 | Tobacco Reef, Belize | CA |
DNA101914, DNA101915 | Cubagua Island, Venezuela | CA |
 | Bessy Cove, St. Lucie Inlet, Florida, USA | FL |
 | Missouri Key, Florida Keys, Florida, USA | FL |
 | Peanut Island, Lake Worth, Florida, USA | FL |
 | Phuket, Thailand | TH |
 | Perrier’s Rock, St. Lucia, South Africa | SA |
DNA100820, DNA100821 | Park Rynie Beach, South Africa | SA |
 | Isla Bolaños, Bahía Salinas, Costa Rica | CR |
 | Ilot Maitre, New Caledonia | NC |

CA Caribbean, CR Costa Rica, FL Florida, NC New Caledonia, SA South Africa, TH Thailand

Sample collection, DNA extraction, and sequencing

All specimens were preserved in 96% ethanol and stored at 4°C (recent collections) or −80°C (older collections) until processed for DNA extraction. DNA was extracted from a piece of tissue, preferably from the retractor muscle. For small specimens (less than 1.0 cm long by 0.5 cm wide) or brittle individuals (caused by ethanol desiccation), not allowing for easy dissection to access the retractor muscle, a piece of the body wall or introvert was used instead. The DNeasy tissue kit from Qiagen was used to extract the DNA following the instructions of the manufacturer.

An 815-base pair fragment of the mitochondrial coding gene cytochrome c oxidase subunit I (COI hereafter) was amplified using the primers designed by Folmer et al. (1994), and part of the mitochondrial ribosomal gene 16S rRNA, 487–493 base pairs long, was amplified using the primers designed by Du et al. (2008). All polymerase chain reactions (PCR hereafter) were carried out in a 25-μl reaction volume containing 2.5 μl 10× PCR buffer with 0.025 M MgCl2 (Applied Biosystems), 0.5 μl dNTPs (10 μM), 0.25 μl of each primer (100 μM), 0.625 U AmpliTaq DNA Polymerase (Applied Biosystems), and 1 μl DNA template. The PCR program consisted of an initial denaturing step at 94°C for 2 min, followed by 35 cycles of denaturation at 94°C (30 s), annealing at 35–38°C for COI and 47–48°C for 16S (30 s) and elongation at 72°C (1 min). A final elongation step at 72°C (7 min) and a rapid thermal ramp to 4°C were applied to finalize the process in a GeneAmp PCR System 9700 (Perkin-Elmer) or an Eppendorf Mastercycler epgradient. PCR products were visualized in 1–1.5% agarose gels and vacuum purified using 96-well Millipore Multiscreen® plates. Sequence reactions were performed in a 10-μl reaction volume using 3.2 μl primer (1 μM), 1 μl of ABI BigDye™ Terminator v3.0 (Applied Biosystems), 0.5 μl BigDye 5× Sequencing Buffer (Applied Biosystems), and 3.3 μl of cleaned PCR product. The reactions were run in a GeneAmp PCR System 9700 (Perkin-Elmer) using the following program: 95°C (3 min), 25× (95°C [10 s], 50°C [5 s], 60°C [4 min]), and a rapid thermal ramp to 4°C. Samples were cleaned using Performa DTR plates (Edge Biosystems) and sequenced using an ABI 3730 Genetic Analyzer.
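The reagent volumes in the PCR recipe can be checked with the dilution relation C1·V1 = C2·V2. The following is a minimal sketch: the function name is hypothetical, only the stock concentrations and volumes come from the text, and because the polymerase is given in units rather than microlitres, the remaining volume is reported without splitting it into enzyme and water.

```python
def final_concentration(stock_conc, stock_volume_ul, total_volume_ul):
    """Dilution relation C1*V1 = C2*V2, solved for C2 (same unit as stock_conc)."""
    return stock_conc * stock_volume_ul / total_volume_ul

TOTAL_UL = 25.0  # reaction volume stated in the text

# Each primer: 0.25 ul of a 100 uM stock
primer_final_um = final_concentration(100.0, 0.25, TOTAL_UL)

# Buffer: 2.5 ul of a 10x stock, giving the working strength in "x"
buffer_final_x = final_concentration(10.0, 2.5, TOTAL_UL)

# Volumes listed explicitly in the text (polymerase is given in units, not ul)
listed_ul = {"buffer": 2.5, "dNTPs": 0.5, "primer_F": 0.25,
             "primer_R": 0.25, "template": 1.0}
remaining_ul = TOTAL_UL - sum(listed_ul.values())

print(primer_final_um, buffer_final_x, remaining_ul)
```

Under these figures each primer ends up at 1 μM and the buffer at 1×, with the rest of the 25 μl made up by enzyme and water.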

Chromatograms were visualized, edited, and assembled using Sequencher™ 4.7 (Gene Codes Corporation© 1991–2006). Sequences from all individuals were subsequently edited using the sequence alignment editor Se-Al v2.0a11. External primers were cropped and discarded from the edited sequences. All new sequences have been deposited in GenBank under accession numbers GU190249–GU190358 and GU230171–GU230186 (Suppl. mat. Table 1).


For complete anatomical and morphological analyses, specimens should be relaxed, fixed in formalin, and preserved in 70% EtOH for long-term storage, but this procedure does not allow reliable genetic sampling. To investigate putative differences in morphology between the studied P. perlucens, we used hooks as a representative structure, which in theory should not be affected by fixation and preservation in 96% EtOH. This decision was based on two principal reasons: (1) though hooks can vary in size and slightly in shape in a single specimen, they have been considered the most consistent character within a developmental series (Cutler 1994) and (2) hooks were the structures found in most of the samples after we processed specimens for molecular analysis. One to three specimens from 10 localities (Caribbean and Florida localities were pooled as their hooks looked identical) were thus selected to represent their population (20 specimens in total). Permanent slides were prepared using hooks that were scraped off from the introvert. Pictures were taken using a Canon Power Shot S5 IS digital camera, attached to an Olympus BX50 compound microscope. All hook preparations and specimens are deposited in the Department of Invertebrate Zoology at the Museum of Comparative Zoology, Harvard University (see the list of slides in Appendix 2).

Phylogenetic analysis

As COI and 16S rRNA are part of the same locus, the analyses conducted herein examined both the individual genes and a concatenated data set including both markers. Phylogenetic analyses were conducted using parsimony and maximum likelihood (ML). For parsimony analyses, trees were generated using Direct Optimization, as implemented in POY v. 4.1.1 (Varón et al. 2009). This “one step” phylogenetic method searches for topologies that minimize the total cost, creating optimal cladograms without using multiple alignments and considering indels as phylogenetically informative transformation events (Wheeler et al. 2006). Trees were generated under a parameter set where opening gaps received a value of 3 and elongation of such gaps costs 1, while base transformations received a value of 2 (3221; De Laet 2005). Tree searches consisted of 100 random addition sequence replicates followed by subtree pruning and regrafting (SPR) and tree bisection and reconnection (TBR) branch swapping, continuing with multiple rounds of tree fusing (Goloboff 1999). Jackknife values (Farris 1997) were obtained with 1000 random addition sequence replicates followed by TBR. Analyses were run for 24 h using 2 processors on a Mac Pro, Dual-Core Intel Xeon, 3 GHz.
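To make the 3221 cost scheme concrete, the sketch below scores a single, already-gapped pairwise alignment. It is not Direct Optimization itself, which optimizes such costs over a tree without a fixed alignment, and it assumes that the first position of a gap costs 3 in total and each further position costs 1; the function name and example sequences are hypothetical.

```python
def alignment_cost(seq_a, seq_b, gap_open=3, gap_extend=1, substitution=2):
    """Cost of an already-gapped pairwise alignment ('-' marks an indel position)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    cost = 0
    gap_in = None  # which sequence currently carries an open gap, if any
    for a, b in zip(seq_a, seq_b):
        if a == "-" and b == "-":
            continue  # an all-gap column carries no information
        if a == "-" or b == "-":
            side = "a" if a == "-" else "b"
            # Assumption: first gap position costs gap_open, later ones gap_extend
            cost += gap_extend if gap_in == side else gap_open
            gap_in = side
        else:
            gap_in = None
            if a != b:
                cost += substitution  # any base change, transition or transversion
    return cost

print(alignment_cost("ACGT", "AGGT"))  # one substitution
print(alignment_cost("ACGT", "A--T"))  # one gap of length two
```

Under this reading, a two-position indel (3 + 1 = 4) costs the same as two substitutions, so the scheme favors a few long gaps over many short ones.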

Maximum likelihood analyses were performed using RAxML 7.0.4 (Stamatakis 2006) on the CIPRES cluster, at the San Diego Supercomputer Center. Analyses of the combined and independent datasets were performed using the GTR model of sequence evolution with correction for a discrete gamma distribution and a proportion of invariant sites (GTR + I + Γ) as selected by Modeltest v.3.7 under the Akaike Information Criterion (Posada and Crandall 1998). Sequences were initially aligned using MUSCLE 3.6 (Edgar 2004) with default parameters, and in the case of COI confirmed using protein sequence translation. Concatenation of the COI and 16S rRNA sequences was performed using PHYUTILITY (Smith and Dunn 2008). Nodal support was estimated via bootstrapping (500 replicates) (Felsenstein 1985; Stamatakis et al. 2008).
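Concatenating the two aligned markers amounts to joining, for each specimen, its COI alignment row to its 16S rRNA alignment row. The sketch below illustrates the idea in plain Python rather than reproducing PHYUTILITY; the function name, sample names, and toy sequences are invented, and padding a missing marker with "?" is an assumption.

```python
def concatenate(alignment_a, alignment_b):
    """Join two alignments (dicts of sample -> aligned sequence) sample by sample."""
    len_a = len(next(iter(alignment_a.values())))
    len_b = len(next(iter(alignment_b.values())))
    combined = {}
    for sample in sorted(set(alignment_a) | set(alignment_b)):
        # A sample missing from one marker is padded with '?' (missing data)
        part_a = alignment_a.get(sample, "?" * len_a)
        part_b = alignment_b.get(sample, "?" * len_b)
        combined[sample] = part_a + part_b
    return combined

# Toy data; sample names and sequences are invented for illustration
coi = {"s1": "ATGC", "s2": "ATGA"}
rrna = {"s1": "GG-T", "s3": "GGAT"}
merged = concatenate(coi, rrna)
for name, seq in merged.items():
    print(name, seq)
```

The length of the joined alignment is the sum of the two marker alignments, which is consistent with the 796 + 498 = 1294 positions reported for this data set.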

Population genetic and demography analysis

Within-population diversity indices, namely number of haplotypes (Nh), haplotype diversity (h), nucleotide diversity (πn), number of polymorphic sites (Np), and average number of pairwise differences (k), were assessed for each gene using DnaSP 4.90.1 (Rozas et al. 2009). The average number of pairwise differences between populations and population pairwise FST, whose significance was assessed with 1,000 permutations, were also calculated using Arlequin v. 3.0 (Excoffier et al. 2005), considering first the four subdivisions with more than three individuals each (Caribbean, Florida, Thailand, and South Africa). To evaluate the hierarchical population structure, an analysis of molecular variance (AMOVA) was performed with Arlequin, using pairwise differences as a measure of divergence, with 16,000 permutations. A second AMOVA was applied only to the Caribbean and Florida samples.
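The diversity indices named above follow standard definitions, which the sketch below illustrates on toy data. It assumes the usual sample-size-corrected haplotype diversity, h = n/(n − 1) · (1 − Σ p_i²), and treats every character, including a gap, as a state; DnaSP may handle gaps and missing data differently. The function name and sequences are hypothetical.

```python
from collections import Counter
from itertools import combinations

def diversity_indices(sequences):
    """Return (Nh, h, k, pi) for a list of equal-length aligned sequences."""
    n = len(sequences)
    length = len(sequences[0])
    counts = Counter(sequences)
    # Haplotype diversity: h = n/(n-1) * (1 - sum of squared haplotype frequencies)
    h = n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))
    # k: mean number of differing sites over all pairs of sequences
    pairs = list(combinations(sequences, 2))
    k = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs) / len(pairs)
    # Nucleotide diversity is k expressed per site
    return len(counts), h, k, k / length

toy = ["AAAA", "AAAA", "AAAT", "AATT"]  # invented sequences
nh, h, k, pi = diversity_indices(toy)
print(nh, round(h, 4), round(k, 4), round(pi, 4))
```

A sample in which all individuals share one haplotype gives h = 0 and k = 0 under these definitions.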

The most powerful tests in detecting a sudden population expansion, a contraction, or a bottleneck when analyzing DNA polymorphism data on non-recombining regions are Fu’s Fs and R2 for small data sets (Ramírez-Soriano et al. 2008). Fu’s Fs neutrality statistic (Fu 1997) and the R2 statistic (Ramos-Onsins and Rozas 2002) were calculated in DnaSP 4.90.1. Their significance was assessed with 10,000 coalescent simulations given the observed number of segregating sites.

A haplotype network was estimated using TCS version 1.21 (Clement et al. 2000), which applies the statistical parsimony procedure (Templeton et al. 1992; Crandall 1996). For the 16S rRNA data set, an alignment generated by MUSCLE 3.6 (Edgar 2004) was used to estimate the haplotype network. Plausible branch connections between haplotypes were tested at 90 to 99% connection limits for COI, 16S rRNA, and the concatenated data set, but the New Caledonia sample was always disconnected. In order to avoid disconnected samples, we chose 115 mutational steps as the connection limit for the individual data sets and 137 steps for the concatenated data set. Loops in the statistical parsimony network were resolved using the prediction of coalescent theory described by Pfenninger and Posada (2002).
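The effect of a fixed connection limit can be illustrated with a much simpler procedure than TCS: count the differences between every pair of haplotypes and group those that can be linked by steps no larger than the limit. This single-linkage sketch is an assumption-laden simplification, not the statistical parsimony algorithm; the function names and toy haplotypes are hypothetical.

```python
from itertools import combinations

def hamming(a, b):
    """Number of positions at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def connected_groups(haplotypes, limit):
    """Group haplotypes linked by chains of pairwise distances <= limit."""
    names = list(haplotypes)
    parent = {name: name for name in names}

    def find(x):
        # Union-find lookup with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(names, 2):
        if hamming(haplotypes[a], haplotypes[b]) <= limit:
            parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), set()).add(name)
    return sorted(sorted(g) for g in groups.values())

toy = {"h1": "AAAAAAAA", "h2": "AAAAAAAT", "h3": "TTTTAAAA", "h4": "TTTTTTTT"}
print(connected_groups(toy, 1))
print(connected_groups(toy, 4))
```

Raising the limit merges groups that a stricter limit keeps apart, which is why a very permissive limit (115 or 137 steps here) is needed before highly divergent samples join a network.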



Twenty Phascolosoma perlucens specimens were selected, and their hooks were analyzed to represent morphological differences. Hooks were photographed for 10 of the 20 specimens selected (Fig. 2b to 2k). The hooks of the closely related species P. albolineatum (Fig. 2a) and of a syntype of P. perlucens (Fig. 2l) were also illustrated. The slide containing the hooks of the syntype specimen (BMNH, Reg. No. 1847.12.30.11), collected in Jamaica, was made by E.B. Cutler and N. Cutler probably during the process of reviewing the subgenus Phascolosoma in the 1980s.
Fig. 2

Strict consensus of all the shortest trees obtained under parameter set 3221 from the concatenated data set at 4102 weighted steps. Numbers below branches indicate parsimony jackknife support values greater than 50%. Numbers above branches indicate ML bootstrap support values greater than 50%. An asterisk indicates support values of 100%. Selected hook morphologies are illustrated on the right (scale bars = 10 μm): a P. albolineatum from Thailand; b P. aff. perlucens from New Caledonia; c and d P. aff. perlucens from Thailand; e P. aff. perlucens from Perrier’s Rock, South Africa; f P. aff. perlucens from Park Rynie Beach, South Africa; g P. perlucens from Costa Rica; h P. perlucens from Bessy Cove, Florida; i P. perlucens from Peanut Island, Florida; j P. perlucens from Venezuela; k P. perlucens from Bank Reef, Barbados; and l hook from the syntype of P. perlucens from Jamaica

Hooks from P. perlucens were described as having a secondary rounded tooth, on the concave side, a triangular pattern at the anterior edge on the base of the hook and a clear “C” streak posterior to the triangle (Fig. 2l) (Cutler 1994). P. albolineatum has the tip of the hook bent at an angle of more than 90°, a large bulge at the concave side of the hook instead of a secondary tooth, and a triangle and a clear streak pattern as in P. perlucens hook (Fig. 2a). Considering the descriptions and comparison between pictures, Fig. 2c and d (representing the Thailand specimens) are among the most dissimilar. These hooks have a curved tip as in P. albolineatum, a larger round tooth (an intermediate stage between a secondary tooth as in P. perlucens and a large bulge as in P. albolineatum) and a small prolongation at the anterior base of the hooks not observed in any other species. On the other hand, Fig. 2e and f (representing specimens from South Africa) are similar to the hooks studied in P. albolineatum. The last five images (Fig. 2g, h, i, j, and k) closely resemble the hooks originally described for P. perlucens, and they represent specimens collected around the Caribbean and Florida. Finally, Fig. 2b, representing the specimen collected in New Caledonia, is similar to the hooks of P. perlucens. Unfortunately, only one specimen was collected from this site, and more samples are necessary to be confident about the overall hook morphology.

Phylogenetic analyses

The parsimony analysis of both markers combined shows paraphyly of Phascolosoma perlucens, as P. albolineatum (a specimen from Thailand) forms a clade with the South African specimens (Fig. 2). A Phascolosoma perlucens clade (100% JF), including samples from Thailand and the single specimen from New Caledonia, is sister group to a clade of South African samples plus P. albolineatum. A second clade (100% JF) includes specimens from the Pacific coast of Costa Rica (100% JF) and its sister group, including all the specimens from the Caribbean and the Atlantic side in Florida.

Slight differences are found when the two markers are analyzed separately (Suppl. mat. Fig. 1). COI data alone (Suppl. mat. Fig. 1a) suggest three main groupings, a Thailand + New Caledonia relationship, the Costa Rica + Caribbean/Florida clade, and a sister group relationship of South Africa with the latter clade. 16S rRNA (Suppl. mat. Fig. 1b) suggests a relationship of the South African samples with those of Thailand and New Caledonia samples, and places P. albolineatum within this group. In addition, COI seems to contain more geographical structure than 16S rRNA. Strikingly, P. perlucens was monophyletic for COI but not for 16S rRNA, wherein two samples from South Africa (DNA100819-7 and DNA100820-2) clustered with the Thai P. albolineatum.

The maximum likelihood analysis for both markers combined (tree topology with a ln L = −9595.897793), and for the two genes analyzed independently (COI tree with a ln L = −5860.520669 and 16S rRNA tree with a ln L = −3356.107127) retrieved similar topologies to those obtained in the direct optimization analyses, with some minor differences in the arrangement of the outgroups (see ML bootstrap values mapped on direct optimization topology and trees in Suppl. mat. Figs. 2 and 3).

Phylogeography and demography of Phascolosoma perlucens

The final length of each COI sequence after trimming was 796 bp, and for 16S rRNA sequences, the final length was between 486 and 494 bp. After alignment, the 16S rRNA data set contained 498 positions. The concatenated data set of COI and 16S rRNA sequences resulted in 1294 nucleotide positions. Using the total number of nucleotides from the concatenated data set, we observed that 305 characters (23.6%) were variable, and 272 (21%) were parsimony informative.
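Variable and parsimony-informative sites can be counted column by column. The sketch below assumes the usual definition of an informative site (at least two character states, each present in at least two sequences) and ignores gaps and ambiguous characters; the function name and toy alignment are hypothetical. The last two lines recompute the percentages from the counts reported above.

```python
from collections import Counter

def site_summary(alignment):
    """Count variable and parsimony-informative columns in aligned sequences."""
    variable = informative = 0
    for column in zip(*alignment):
        counts = Counter(base for base in column if base not in "-?N")
        if len(counts) > 1:
            variable += 1
            # Informative: at least two states, each present in at least two sequences
            if sum(1 for c in counts.values() if c >= 2) >= 2:
                informative += 1
    return variable, informative

toy = ["ACGTA", "ACGAA", "ATGAC", "ATGTA"]  # invented alignment
print(site_summary(toy))

pct_variable = round(100 * 305 / 1294, 1)
pct_informative = round(100 * 272 / 1294, 1)
print(pct_variable, pct_informative)
```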

Using the 56 sequences, we recovered a total of four COI main lineages (Suppl. mat. Table 2). Each lineage was named after nucleotide number 204 in the sequenced COI fragment. Lineage A (5 individuals) was found in Thailand and New Caledonia, lineage C (2 individuals) was found in Costa Rica, lineage G (17 individuals) was found in South Africa, and lineage T (32 individuals) was found in the Caribbean and Florida.

Twenty-five unique haplotypes were recovered from the COI data set (Suppl. mat. Table 2). Two haplotypes (3 and 4) were reported for the majority of samples, with 11 and 13 individuals, respectively, from the Caribbean and Florida. Haplotype 13 from South Africa is the third most abundant, with 6 samples. The other 22 haplotypes occurred in low frequency and are distributed among all samples with one or two representatives each.

Fourteen haplotypes were recovered for the 16S rRNA sequence data (Suppl. mat. Table 3). Haplotype d includes 31 samples from the Caribbean and Florida. The second most-abundant haplotype is f with 11 samples, all from South Africa. The remaining 12 haplotypes are each represented in a single or two individuals, distributed mainly in South Africa, Thailand, and the Pacific coast of Costa Rica.

The statistical parsimony networks for COI (Fig. 3a) and the combined data (Fig. 4a, b, and c) each showed three unconnected networks, and the two sets of networks were very similar. Thailand sequence types are represented in the smallest network, with the unique New Caledonian sequence connected to it by 109 (for COI, Fig. 3a) and 132 (concatenated data set, Fig. 4a) mutational steps. A second haplotype network includes all the South African specimens (Fig. 3a for COI and Fig. 4b for the concatenated data set). The largest network includes sequence types from the Caribbean and Florida, with the sequence types from Costa Rica separated by 80 (COI, Fig. 3a) and 110 (concatenated data set, Fig. 4c) mutational steps. Shared haplotypes are found only among the Western Atlantic populations (Caribbean and Florida). In contrast to COI and the concatenated data set, 16S rRNA shows a single network (Fig. 3b) due to the lower variability in this marker when compared to COI. Two loops encompassing sequences e and g were resolved by maintaining the connection to the interior, most frequent haplotypes (Pfenninger and Posada 2002). A third loop encompassing sequences g and h was not resolved. One loop detected in the concatenated haplotype network, encompassing sequence 13j, was resolved according to the aforementioned criterion.
Fig. 3

TCS networks based on a COI data and b 16S rRNA data. Alphanumeric name was designated for each haplotype: numbers next to colored circles correspond to COI sequences, lowercase letters to 16S rRNA sequences (Suppl. mat. Table 1). Sampled haplotypes are indicated by colored circles; unlabeled connections between haplotypes represent a single mutational step; inferred haplotypes are indicated by a black dot or by bars with the exact value of mutational steps. Equally parsimonious connections are represented by dashed lines
Fig. 4

TCS networks based on the combined COI and 16S rRNA data. Sequences from the concatenated data set were represented by an alphanumeric symbol including a number and a letter (numbers correspond to COI sequences, lowercase letters to 16S rRNA sequences, see Suppl. mat. Table 1). Other symbols as in Fig. 3

Because of the small sample size of lineages A and C, the demography analysis was conducted considering only lineages with four or more individuals each (i.e., samples from New Caledonia and Costa Rica were excluded). In our study, Fu’s FS was negative and significant for most lineages for the COI data set, except for lineage A (TH samples), with the smallest sample (Table 2). Non-significant Fu’s FS values were recovered for all sites in the 16S rRNA data set, and low and significant Fu’s FS values were obtained for all lineages in the concatenated data set (Table 2). The R2 values were significant for the Caribbean and Florida populations corroborating Fu’s FS values for COI in concatenated data set, and these values were non-significant for the Thailand and South African samples (Table 2).
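Table 2

Demography parameters and standard deviations for each lineage of P. perlucens

Data set | Subdivision by geographical distribution | N | Nh | h | Np | πn | k | Fu’s FS | R2
COI (796 sites) | CA | | | 0.700 ± 0.075 | | 0.00117 ± 0.00210 | 0.93281 ± 0.66577 | |
COI (796 sites) | FL | | | 0.806 ± 0.120 | | 0.00147 ± 0.00185 | 1.16667 ± 0.82538 | |
COI (796 sites) | TH | | | 0.500 ± 0.265 | | 0.00691 ± 0.00366 | 5.50000 ± 3.34344 | |
COI (796 sites) | SA | | | 0.882 ± 0.072 | | 0.00238 ± 0.00044 | 1.89706 ± 1.13728 | |
16S rRNA (498 sites) | CA | | | 0.087 ± 0.078 | | 0.00018 ± 0.00016 | 0.08696 ± 0.01640 | |
16S rRNA (498 sites) | FL | | | 0.000 ± 0.000 | | 0.00000 ± 0.00000 | 0.0000 ± 0.0000 | |
16S rRNA (498 sites) | TH | | | 0.833 ± 0.222 | | 0.03080 ± 0.01570 | 15.00000 ± 8.54704 | |
16S rRNA (498 sites) | SA | | | 0.596 ± 0.139 | | 0.02423 ± 0.01306 | 14.30882 ± 6.74932 | |
Concatenated (1294 sites) | CA | | | 0.723 ± 0.079 | | 0.00079 ± 0.00014 | 1.01976 ± 0.70818 | |
Concatenated (1294 sites) | FL | | | 0.806 ± 0.120 | | 0.00091 ± 0.00022 | 1.16667 ± 0.82538 | |
Concatenated (1294 sites) | TH | | | 1.000 ± 0.177 | | 0.01598 ± 0.00595 | 20.50000 ± 11.55492 | |
Concatenated (1294 sites) | SA | | | 0.978 ± 0.031 | | 0.01064 ± 0.00489 | 16.20588 ± 7.60194 | |
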
Significant values (P < 0.05) indicated in bold type. CA Caribbean, FL Florida, TH Thailand, SA South Africa, N number of sampled individuals, Nh number of haplotypes, h haplotypic diversity, Np number of polymorphic sites, πn nucleotide diversity, k mean number of pairwise difference

Population genetics of Phascolosoma perlucens

Haplotypic diversity (h; Table 2) for both genes analyzed separately had values over 0.50 with only the Florida population (N = 9) showing no haplotypic diversity for 16S rRNA. Nucleotide diversity (πn; Table 2) values are low for all COI sequences combined and for the Caribbean and Florida populations for 16S rRNA. Thailand and South African samples present higher haplotypic diversity for 16S rRNA. Considering the concatenated data set (Table 2), all populations show high haplotypic diversity, ranging from 0.72 to 1, while the nucleotide diversity was lower, ranging from 0.00079 for the Caribbean samples to 0.0160 for Thailand. The number of polymorphic sites (Np) for 16S rRNA varies from 0 to 68, and for COI from 4 to 13. Pairwise FST values were high (FST > 0.90) and significant for Thailand and South Africa (Suppl. mat. Table 4). Low non-significant FST values were observed for the Caribbean and Florida samples (Suppl. mat. Table 4).
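Nucleotide diversity and the mean number of pairwise differences are linked by πn = k / L, where L is the number of sites compared. As a quick consistency check, and assuming all 796 COI sites enter the calculation, the sketch below recomputes πn from the four COI values of k listed in Table 2; the variable names are hypothetical.

```python
COI_SITES = 796  # trimmed COI length reported in the text

# (k, reported pi_n) pairs for the four COI rows of Table 2
coi_rows = [
    (0.93281, 0.00117),
    (1.16667, 0.00147),
    (5.50000, 0.00691),
    (1.89706, 0.00238),
]

recomputed = [k / COI_SITES for k, _ in coi_rows]
for (k, reported), value in zip(coi_rows, recomputed):
    print(f"k = {k:.5f}  reported = {reported:.5f}  k/L = {value:.5f}")
```

The recomputed COI values agree with the reported ones to the precision given. The same relation need not hold exactly for the 16S rRNA and concatenated rows, where alignment gaps may reduce the number of sites actually compared.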

The AMOVA analysis was used to test for hierarchical population structure with each subdivision treated as a separate group (Suppl. mat. Table 5). The AMOVA results of haplotypes revealed that 95.19% of the genetic variation was found between subdivisions, while 4.81% of variation was found within subdivisions. The second AMOVA analysis was performed considering only the best sampled Caribbean and Florida populations. We considered four different localities: Barbados (N = 15), Belize (N = 5), Venezuela (N = 3), and Florida (N = 9). The results revealed that 3.86% of the genetic variation was found between populations, while 96.14% of variation was found within populations (Suppl. mat. Table 6). Pairwise FST values were low and non-significant (Suppl. mat. Table 7).
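In an AMOVA with only two levels (among groups and within groups), the fixation index equals the share of the total variance found among groups. Assuming both analyses above have that two-level design, the reported percentages translate directly into Φ_ST values, as sketched below with a hypothetical helper function.

```python
def phi_st(pct_among, pct_within):
    """Fixation index for a two-level AMOVA from the two variance percentages."""
    total = pct_among + pct_within
    return pct_among / total

# Percentages reported in the text
all_subdivisions = phi_st(95.19, 4.81)       # among vs within subdivisions
western_atlantic = phi_st(3.86, 96.14)       # Barbados, Belize, Venezuela, Florida

print(round(all_subdivisions, 4), round(western_atlantic, 4))
```

The contrast between the two values (about 0.95 versus about 0.04) summarizes the main pattern: strong structure among distant subdivisions and almost none within the Caribbean and Florida.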


The use of genetic markers has proved an effective way to examine population structure (Bucklin and Kocher 1996), and mitochondrial DNA sequences have been used broadly to delimit species boundaries (Wiens 1999). More recently, the use of mitochondrial DNA sequences has been contentious, and two extreme viewpoints have emerged (see a review in Rubinoff and Holland 2005): one criticizes the exclusive use of mtDNA, while the other endorses one particular gene (cytochrome c oxidase subunit I) as a universal marker. We adhere to the conclusions of Rubinoff and Holland (2005) with respect to the use of multiple sources of information, including mitochondrial and nuclear markers, whenever possible, or as in our case by combining two mitochondrial markers and testing them against morphological data. The ideal strategy to assess relationships should be an integrative approach using mitochondrial and nuclear genes in combination with morphological, behavioral, and ecological data (e.g., Álvarez-Padilla et al. 2009; Huber and Astrin 2009). However, our main focus of study here is intraspecific variation and not the phylogenetic patterns above the species level. Mitochondrial data, even if they potentially conflict with certain nuclear data, can still describe the main components of the population structure in most groups of marine invertebrates.

Population structure and lineages of Phascolosoma perlucens

The results obtained in this study suggest high levels of genetic differentiation between purportedly conspecific populations of the widespread peanut worm Phascolosoma perlucens. We found four different lineages previously identified as P. perlucens, all supported by the phylogenetic analysis of the two mitochondrial markers. These four main clades were retrieved in the COI and combined data sets and were further corroborated by the haplotype networks.

Species boundaries have often been inferred on the basis of statistical parsimony networks, in which unconnected haplotypes are interpreted as separate species (e.g., Tarjuelo et al. 2004; Uthicke et al. 2004; Addison and Hart 2005; Jolly et al. 2005; Thornhill et al. 2008). Our analysis revealed three isolated networks with probably no gene flow between them. Some patterns furthermore suggest paraphyly of P. perlucens owing to the inclusion of P. albolineatum. The low and non-significant pairwise FST values for the Caribbean and Florida, but high and significant FST values for Thailand and South Africa, combined with the fact that most of the variation occurs between localities, support the possibility of at least three distinct species previously considered as P. perlucens: one in the Atlantic region of the Americas (perhaps connected to the Pacific American specimens), one in Southern Africa, and one uniting the tropical Indo-Pacific populations (Fig. 2). However, the lack of additional collections from intermediate populations between the sample sites prevents us from establishing species delimitations with more certainty.
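The idea of "unconnected haplotypes" can be sketched as a grouping problem. The example below is a deliberate simplification, not the statistical parsimony procedure itself: real implementations derive the connection limit from a probability criterion and may infer unsampled intermediate haplotypes. Here the limit is a hypothetical parameter, and the names `hamming` and `separate_networks` are chosen for the illustration.

```python
from itertools import combinations


def hamming(a, b):
    """Number of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def separate_networks(haplotypes, connection_limit):
    """Group haplotypes into networks linked by chains of short steps.

    haplotypes:       dict mapping haplotype name -> aligned sequence.
    connection_limit: hypothetical maximum number of mutational steps
                      allowed for a direct connection.
    Returns a sorted list of sorted name lists, one per network.
    """
    parent = {name: name for name in haplotypes}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(haplotypes, 2):
        if hamming(haplotypes[a], haplotypes[b]) <= connection_limit:
            parent[find(a)] = find(b)

    networks = {}
    for name in haplotypes:
        networks.setdefault(find(name), []).append(name)
    return sorted(sorted(names) for names in networks.values())
```

Haplotypes that end up in different lists cannot be joined by any chain of connections within the limit, which is the pattern interpreted above as a probable absence of gene flow.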

Although lineage A is formed by samples collected in Thailand and New Caledonia, our results show 132 mutational steps separating the two populations on the concatenated haplotype network; most of this variation, however, comes from COI. The fact that the hooks from the New Caledonia specimens are more similar to those of the original description of P. perlucens than to those from the Thailand specimens may indicate the existence of two different species, but again, this needs to be studied in more detail once intermediate geographical samples become available.

The Caribbean and Florida populations, designated lineage T, appear panmictic, with low and non-significant pairwise FST values and with most of the variation occurring within localities. The statistical parsimony analysis of the concatenated data set (Fig. 4) shows two main haplotypes within lineage T (3d and 4d) comprising a large number of specimens. The remaining haplotypes are private or occur at low frequencies. This star-shaped haplotype network pattern suggests that a recent population expansion may have occurred in lineage T, as supported by the negative and significant Fu’s FS, and by R2. Alternatively, a negative and significant Fu’s FS may indicate positive selection (Fu 1997).
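To make the link between a star-shaped network and the R2 statistic concrete, here is a minimal sketch of R2 as it is usually defined (the formulation of Ramos-Onsins and Rozas): the root mean squared deviation of per-sequence singleton counts from half the mean number of pairwise differences, divided by the number of segregating sites. The function name `r2_statistic` is an assumption, and the code assumes more than two gap-free aligned sequences without an outgroup, so an allele carried by exactly one sequence is treated as a singleton.

```python
from itertools import combinations
from math import sqrt


def r2_statistic(seqs):
    """R2 for a list of aligned sequences of equal length (sketch)."""
    n = len(seqs)
    length = len(seqs[0])
    singletons = [0] * n  # U_i: singleton mutations carried by sequence i
    seg_sites = 0         # S: number of segregating sites

    for site in range(length):
        column = [s[site] for s in seqs]
        if len(set(column)) < 2:
            continue  # monomorphic site
        seg_sites += 1
        for i, base in enumerate(column):
            if column.count(base) == 1:
                singletons[i] += 1

    if seg_sites == 0:
        raise ValueError("R2 is undefined without segregating sites")

    # k: mean number of pairwise differences.
    pairs = list(combinations(range(n), 2))
    k = sum(
        sum(a != b for a, b in zip(seqs[i], seqs[j])) for i, j in pairs
    ) / len(pairs)

    return sqrt(sum((u - k / 2) ** 2 for u in singletons) / n) / seg_sites
```

Low values are expected after a recent expansion, because many haplotypes then differ from a common one by single private mutations. Significance is normally judged against coalescent simulations, which this sketch does not attempt.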

The single haplotype network observed for the Caribbean and Florida samples, and the fact that they share haplotypes for each marker, suggest that there is connectivity among them. The dispersal of P. perlucens between these two regions could be explained by the capacity of a sipunculan larva to remain in the water column for long periods of time. Scheltema and Hall (1975) estimated the age of a pelagosphaera at between 48 and 125 days and calculated that it could drift 1500 km across the Northern Atlantic Ocean, assuming a current of 0.5 to 1.3 km/h. The faster ocean currents observed in tropical Western Atlantic waters (1.25 to 2.5 km/h; Shulman and Bermingham 1995) could facilitate dispersal of P. perlucens, maintaining the genetic connectivity between specimens from our four collecting localities (Barbados, Belize, Florida, and Venezuela). Observations of hook morphology for specimens collected in these four localities, together with the fact that the type locality of the species is in Jamaica, reinforce the possibility that this group of specimens could represent the nominal P. perlucens.
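The drift figures quoted above follow from simple arithmetic, sketched below under strong assumptions: a constant, unidirectional current for the whole larval period, and Western Atlantic speeds expressed in km/h like the North Atlantic ones. The function names are hypothetical.

```python
HOURS_PER_DAY = 24


def drift_distance_km(speed_kmh, days):
    """Distance drifted at a constant current speed over a number of days."""
    return speed_kmh * HOURS_PER_DAY * days


def drift_range_km(speed_range_kmh, age_range_days):
    """Shortest and longest drift for given speed and larval age ranges."""
    slow, fast = speed_range_kmh
    young, old = age_range_days
    return drift_distance_km(slow, young), drift_distance_km(fast, old)


# Larval age range reported above (48 to 125 days).
AGE_RANGE_DAYS = (48, 125)

# North Atlantic current assumed by Scheltema and Hall (0.5 to 1.3 km/h).
north_atlantic = drift_range_km((0.5, 1.3), AGE_RANGE_DAYS)

# Tropical Western Atlantic currents (1.25 to 2.5, assumed to be km/h).
western_atlantic = drift_range_km((1.25, 2.5), AGE_RANGE_DAYS)
```

With these inputs, the slowest North Atlantic current sustained for the full 125 days gives the 1500 km quoted above, and the faster Western Atlantic currents would cover a comparable distance within the shortest larval period. Real trajectories depend on eddies, current reversals, and larval behavior, so these numbers are upper-bound style estimates rather than predictions.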

Even though the two specimens from Costa Rica were not included in the demographic analysis, the haplotype network and phylogenetic analyses showed a close relationship to the Caribbean population. The Costa Rican specimens constitute the sister group to the Western Atlantic clade and are separated from it by 110 mutational steps in the concatenated haplotype network. However, the hooks are very similar to the ones observed in the Western Atlantic populations. The great molecular divergence between the specimens from these two regions, despite their similar hook morphology, could indicate a case of allopatric speciation caused by the rise of the Isthmus of Panama (or ‘geminate’ species, e.g., Marko 2002), as shown in many other benthic invertebrates.

Morphological conclusions

Sipunculan taxonomy has often relied on morphological traits that themselves vary within what is considered a species. Cutler and Cutler’s (1990) revisionary work resulted in the synonymy of 18 species under the name P. perlucens, which was considered to be easily recognizable by the hooks with a secondary round tooth at the concave side, the internal triangle and the C-streak pattern, along with the exclusively conical, red, pre-anal, posteriorly directed papillae. Even though it is well known that certain intraspecific variation can occur, hook morphology has been considered to be consistent within species by E.B. Cutler (1994). Although only ten specimens were used to illustrate differences between hooks in this study, and explicit morphometric methods were not employed to quantify this continuous character, the comparison clearly shows that hooks differ considerably among the four lineages inferred (Fig. 2). These observations support Cutler’s idea of hooks being “consistent” within populations, but not his notion of the species P. perlucens. However, a more thorough investigation taking into account hook shape variation will be necessary to help establish morphological limits between species with more accuracy.

Funk and Omland (2003) reviewed the causes of species-level paraphyly and polyphyly in molecular data and identified several, including phylogenetic “errors” (such as unrecognized paralogy and inadequate phylogenetic information), population-level processes (i.e., interspecific hybridization, incomplete lineage sorting), and imperfect taxonomy. The latter includes cases of overlumped species, which are best exemplified by the 18 synonyms of P. perlucens. To investigate the odd placement of P. albolineatum with other South African specimens, we examined the pre-anal papillae of several exemplars in the South African clade. We found that the two specimens clustering with P. albolineatum (MCZ DNA100820-2 and MCZ DNA100819-7) in the combined and 16S rRNA analyses (Fig. 2) carried dome-shaped pre-anal papillae, as in P. albolineatum. However, specimens from the main South African clade have papillae covering the introvert along ca. two-thirds of its length, unlike the pre-anal papillae observed in other Phascolosoma species. Furthermore, these papillae are conical and short, and in some specimens, some papillae appear to be directed toward the posterior end, as observed and described in P. perlucens. The papillar distribution, in tandem with the similarity of the hooks to those of P. albolineatum, makes this a potentially new morphotype, given that these traits fit neither the P. perlucens nor the P. albolineatum description. The confusion between these two species is thus historical. Baird (1868) described P. perlucens and P. albolineatum in the same article and commenced the P. albolineatum description with “This species is much larger than the preceding [P. perlucens], but resembles it in many respects.” Both descriptions are very similar, except that P. perlucens was collected in Jamaica and extracted from coral rubble, and P. albolineatum was found in the Philippine Islands with no reference to the habitat in which it was found.

The recent collections of specimens from South Africa were identified by at least three sipunculan specialists, all concluding that the specimens belong to P. perlucens. However, the new interpretation of the morphological characters in conjunction with the molecular data strongly suggests that the specimens from South Africa are not the true P. perlucens, and thus we choose to refer to them as P. aff. perlucens until further morphological work allows for clear delimitation of these samples. Likewise, in both phylogenetic analyses, the Indo-Pacific specimens were retrieved as a well-supported clade, sister to the South African clade plus P. albolineatum. Molecular data thus strongly support the existence of another possible morphotype. Until studies can demonstrate that this species is distinct from P. perlucens, we opt to refer to the Thai and New Caledonia specimens as Phascolosoma aff. perlucens, as with the South African clade.

Concluding remarks

The simple morphology of sipunculan worms, in addition to the poor definitions of characters used to discern species, has resulted in a rather chaotic taxonomic system. The group is characterized by long lists of synonymies, resulting in the large number of species currently considered virtually cosmopolitan, such as P. perlucens, for which diagnostic morphological characters are unclear. The necessity of molecular methods to identify a putative species complex within the nominate species P. perlucens reflects the difficulty of delimiting species within the genus Phascolosoma based on few, ill-defined morphological characters (see also Schulze et al. 2005).

Cryptic species are defined as two or more species that have been classified as a single nominal species because they are morphologically indistinguishable (Mayr and Ashlock 1991). Based on this concept, we cannot consider the case of P. perlucens a cryptic species problem, at least with respect to the South African and Indo-Pacific samples. Our results suggest that what previous researchers identified as a single widespread P. perlucens shows morphological differences that should be revisited. Our study indicates that P. perlucens is probably a case of overconservative taxonomy, but we cannot rule out the possibility of cryptic or geminate speciation among P. perlucens from the Western Atlantic and Eastern Pacific regions. The conservation of hook morphology notwithstanding, there is considerable genetic divergence between the Caribbean/Florida and the Costa Rica (Pacific side) populations that needs to be further evaluated.

Our analysis of two mitochondrial genes reveals that the circumtropical cosmopolitan species Phascolosoma perlucens is probably a complex of species resulting from a mixture of overconservative taxonomy and cryptic speciation. Although our sampling is limited in number of individuals and geographical scope, it is sufficient to identify variation in hook morphology that in some localities correlates with high genetic divergence between populations of P. perlucens. Furthermore, the three isolated haplotype networks reflect a probable lack of gene flow between the four geographical divisions. Our study confirms the taxonomic problems pointed out by Schulze et al. (2007) and shows that there are morphological characters in the genus Phascolosoma that can help molecular studies to determine species boundaries. However, larger samples of individuals and localities, and more genes, are needed before a comprehensive taxonomic revision of the genus can be attempted.


This study is dedicated to the memory of Edward Cutler who guided us through the study of sipunculans. We will always remember his friendship and passion for his beloved sipunculan worms. Anja Schulze, Harlan Dean, Joergensen Hylleberg, Mary Rice, and Ramlall Biseswar assisted with samples, and Cláudio Gonçalves Tiago with fieldwork in New Caledonia. Sónia Andrade assisted with population genetics software, and Prashant Sharma with phylogenetic analyses. This study was funded by a grant from the MarCraig foundation, which supported Gisele Kawauchi and the sipunculan research at the Giribet Laboratory. Fieldwork to New Caledonia was supported by the Putnam Expedition Grants program of the Museum of Comparative Zoology. David Paulaud from Direction de l’Environnement (Province Sud) facilitated collecting permits for New Caledonia. Bertrand Richer De Forges, Jean-Lou Justine, and Claire Goiran (IRD-Noumea) assisted with logistics in Noumea. Ann Covert and the Museum of Comparative Zoology provided additional funding and support. Three anonymous reviewers and Associate Editor Cynthia Riginos provided comments that helped to improve earlier versions of this manuscript.

Supplementary material

Supplementary material 1 (PDF 1008 kb)

Copyright information

© Springer-Verlag 2010