Polar Biology, Volume 30, Issue 4, pp 513–521

The utility of fast evolving molecular markers for studying speciation in the Antarctic benthos

Authors

  • Christoph Held
    • Alfred Wegener Institute for Polar and Marine Research
  • Florian Leese
    • Alfred Wegener Institute for Polar and Marine Research
    • Lehrstuhl für Spezielle Zoologie, Ruhr-University of Bochum
Original Paper

DOI: 10.1007/s00300-006-0210-x

Cite this article as:
Held, C. & Leese, F. Polar Biol (2007) 30: 513. doi:10.1007/s00300-006-0210-x

Abstract

The Southern Ocean is surprisingly rich in species that coexist in one of the most extreme environments on Earth, yet the processes leading to speciation in this ecosystem are not well understood. To remedy this, tools that measure the genetic connectedness within a species are needed. Although useful for phylogenetic purposes, the readily available mitochondrial markers (e.g. 16S, COI) suffer from numerous shortcomings for population genetics. Therefore, molecular markers are needed that are sufficiently variable, unlinked, biparentally inherited, and distributed over the whole genome. We argue that microsatellites are suitable markers that have not been widely used in exploratory studies due to their difficult initial set-up. Working with the Ceratoserolis trilobitoides species complex (Isopoda), we demonstrate that, using a novel protocol, many microsatellites can be identified quickly. An increased availability of these highly sensitive markers will be useful for studies addressing the origin of species in the Southern Ocean and their response to future climate change.

Keywords

Antarctic benthos, Population genetics, Phylogeography, Microsatellites, Speciation

Introduction

The fauna of the Southern Ocean differs substantially from those of all other oceans due to extreme environmental conditions (Aronson and Blake 2001; Clarke 1983; Peck 2002). Despite these, an extraordinarily high species diversity within the zoobenthos is now well established (Gray 2001) and molecular methods continue to add to our knowledge about the number of species and their phylogenies.

The identification of species is particularly difficult in the Southern Ocean and, perhaps not surprisingly, molecular data have identified cryptic species in many taxa (Bernardi and Goswami 1997; Held 2003; Held and Wägele 2005; Raupach and Wägele 2006). It is unclear at this point whether this represents the Antarctic share of a more general phenomenon that currently receives attention in the scientific community (cryptic species and DNA barcoding; see Hebert et al. 2003; Tautz et al. 2003) or whether cryptic speciation in the Southern Ocean is driven mainly by processes unique to the Antarctic.

Until now, the taxa receiving most attention from molecular phylogeneticists have been fishes (Bargelloni et al. 2000; Lecointre et al. 1997; Near et al. 2003, 2004; Ritchie et al. 1996, 1997), molluscs (Page and Linse 2002), echinoderms (Lee et al. 2004), and crustaceans (Baltzer et al. 2000; Clarke et al. 2004; Held 2000; Jarman 2001; Jarman et al. 2000, 2002; Loerz and Held 2004; Patarnello et al. 1996; Raupach et al. 2004; Zane et al. 1998; Zane and Patarnello 2000). Although primarily aimed at resolving phylogenetic questions, interpreting the molecular trees in the light of additional sources of information (mainly biogeography and geology) has allowed answering questions broader in scope concerning colonization of habitats, origins of distributions, evolution of adaptations, and speed of molecular substitution on an evolutionary timescale (Held 2001; Near 2004; Stankovic et al. 2002). This phylogenetically motivated approach has proven fruitful and we continue to see a place for it in the future.

However, while our knowledge of the identity, diversity, and distribution of modern species in the Southern Ocean continues to grow, our understanding of what led to the high number of species in the first place is less comprehensive. Typical adaptations in the life cycles of Antarctic taxa (e.g. few descendants, no pelagic distributional stages, and low mobility of adults) can be assumed to decrease gene flow, thus enhancing the possibility of speciation. On the other hand, the consequences of frequent periodical glaciations of major parts of the Antarctic continent can counteract this by eradicating most of the benthic communities through large-scale spread of grounded ice (Thatje et al. 2005). This could also have structured genetic diversity by lineage sorting in the isolated, ice-free refugia.

In order to understand which of the above-mentioned processes or adaptations may have led to the disruption of a previously contiguous gene pool and ultimately led to the generation of new species in the evolutionary past, a more comprehensive knowledge of what is structuring the present-day distribution of genetic polymorphisms within a species becomes necessary. Molecular methods and population genetics in particular can provide a valuable contribution to building a microevolutionary framework that will enable us to finally measure population interconnections.

Markers with higher rates of change than single locus coding genes (including the mitochondrial genome) are required to measure population genetic parameters such as gene flow through migration and dispersal. Despite the fact that the vast majority of the Antarctic ecosystem is composed of benthic invertebrate animals (Gutt et al. 2004), such markers are only available for a handful of vertebrates and pelagic invertebrates in the Southern Ocean (Hoelzel et al. 2002; Reilly and Ward 1999; Roeder et al. 2001; Shaw et al. 2004; Valsecchi et al. 1997). To our knowledge, the Antarctic benthos is completely unstudied in this respect.

The aim of this paper is twofold. On the one hand, we would like to emphasize the emerging necessity to develop more fine-scaled molecular markers in an attempt to understand the genesis of the surprising diversity of species in the Antarctic benthos. On the other hand, we demonstrate that even in almost unknown genomes, suitable molecular markers that are sufficiently variable to allow for microevolutionary studies of intraspecific patterns of DNA polymorphism and gene flow can be isolated in a standard molecular lab in a relatively short time. Our study subject is a new species found in the isopod genus Ceratoserolis Brandt 1988, a taxon serving as a model for the Antarctic zoobenthos.

Materials and methods

Here we outline the strategy of obtaining microsatellite markers from a relatively unknown genome (in this case from the Ceratoserolis trilobitoides species complex).

In short, the genome of the species under study (target genome) is cut into shorter fragments. In order to identify the fragments containing short tandem repeats (microsatellites), they are hybridized against immobilized genomic fragments of an unrelated species (reporter genome). The stringency of the subsequent washing steps is such that only fragments with higher similarity than what can be expected by chance are retained. Since the two genomes show little similarity by descent, the greatest similarity between them occurs often between regions of low complexity such as microsatellites. The retained fragments of the target genome will therefore likely be enriched in microsatellites.

For more technical details and an evaluation of the properties of this and other protocols for the isolation of microsatellites, the reader is referred to another paper (Leese et al. 2006, submitted) and the original papers describing the method (Nolte et al. 2005; Zane et al. 2002).

Sampling

Specimens belonging to the C. trilobitoides complex were collected by hand-sorting the catch of bottom-trawled gear on three expeditions to the Antarctic: ANT XIII/3 and ANT XIV/2 in 1996 with RV Polarstern, and ICEFISH 2004 with RV Nathaniel B. Palmer. Specimens were preserved in cold 80–96% ethanol shortly after sorting the catch. Genomic DNA was extracted with the Qiagen DNeasy Kit according to the manufacturer’s protocol. For details concerning preservation and DNA extraction methods, see Held (2000).

Molecular genetic techniques

Three different hybridization filters with single-stranded reporter genomes of Mus musculus domesticus, Drosophila melanogaster, and Homo sapiens were prepared according to Nolte et al. (2005). DNA from 18 specimens from six populations of the C. trilobitoides species complex was pooled. Digestion of this template (1 μg) and ligation to the AFLP adaptors (5′-TACTCAGGACTCAT-3′/5′-GACGATGAGTCCTGAG-3′) were carried out simultaneously with an excess of 10–20 U of the MseI isoschizomer TruI and 10–30 U T4-DNA ligase in a 1× Buffer-R reaction mix (100 μl). The two enzymes have different optimal working temperatures (65 and 22°C, respectively). In accordance with the suggestions of the supplier, the reaction was incubated at an intermediate temperature (37°C) for 6 h. Genomic fragments were subjected to a 10-min incubation at 72°C in a complete PCR mix without primers (1× Taq Buffer (1.5 mM Mg2+), 0.2 mM dNTPs, 2 U Taq Polymerase, all products Eppendorf) to eliminate nicks remaining from the adaptor ligation. Of this mixture, 5 μl was PCR amplified in a total volume of 50 μl with the AFLP adaptor-specific primer MseI-N (5′-GATGAGTCCTGAGTAAN-3′) in a Techne thermocycler (94°C 4 min, 25 cycles of 94°C 30 s, 53°C 1 min, 72°C 1 min, and a final elongation step of 72°C for 7 min). Fragments from this preamplification were size-selected (400–800 bp) on a 1.5% agarose gel. Reactions were purified from the gel using the Eppendorf Perfectprep Gel Cleanup Kit. To eliminate shorter fragments trapped in the gel matrix, electrophoresis and subsequent purification of the PCR products from the gel were repeated twice.
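The digestion and size-selection steps can be mimicked in silico to get a feeling for which fragments survive. The following is only a minimal sketch under stated assumptions: it cuts at the MseI/TruI recognition site TTAA (after the first T), ignores the adaptors and the preamplification, and applies the 400–800 bp window directly to the raw fragments. The function names `digest_msei` and `size_select` are hypothetical and are not part of the published protocol.

```python
def digest_msei(seq):
    """Cut a sequence at every TTAA site (MseI/TruI: T^TAA).

    Returns the list of fragments in their original order.
    Simplification: single strand only, no adaptors attached.
    """
    seq = seq.upper()
    fragments = []
    start = 0
    pos = seq.find("TTAA")
    while pos != -1:
        # The enzyme cuts after the first T of the recognition site.
        fragments.append(seq[start:pos + 1])
        start = pos + 1
        pos = seq.find("TTAA", pos + 1)
    fragments.append(seq[start:])
    return fragments


def size_select(fragments, lo=400, hi=800):
    """Keep fragments whose length lies within [lo, hi] bp."""
    return [f for f in fragments if lo <= len(f) <= hi]


# Toy sequence (invented for illustration) with two TTAA sites.
toy = "G" * 10 + "TTAA" + "C" * 500 + "TTAA" + "G" * 5
pieces = digest_msei(toy)
selected = size_select(pieces)
print([len(p) for p in pieces], [len(s) for s in selected])
```

In the wet lab the selected size range refers to adaptor-ligated, amplified fragments on a gel, so the numbers from such a sketch are only a rough guide.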

Amplified fragments containing microsatellites were selectively enriched by hybridizing 300–500 ng against immobilized single-stranded DNA fragments of the reporter genomes bound to nylon membranes as described in Nolte et al. (2005). Three washing steps were performed to keep only those fragments of the target genome that formed stable duplexes with the immobilized reporter genome, indicating high similarity in low-complexity regions.

Thereafter, the hybridized fragments were eluted using 30 μl of TE buffer (10 mM Tris–HCl, 1 mM EDTA, pH 7.5).

Enriched fragments were amplified again as above in 25 μl reaction volumes to obtain a sufficient concentration for cloning (10–100 ng) and purified from the gel as above. The purified enriched fragments were then cloned into a pCR®2.1 TOPO® TA vector (Invitrogen) and transformed into competent E. coli (Promega JM 109, TOP10F’).

Initially, the colonies were PCR screened for microsatellites using a combination of primers located in the multiple cloning site of the vector and a third primer anchored in the repeat sequence of the insert, if present (Lunt et al. 1999). In subsequent runs, this confirmation step was omitted since the majority of the inserts (approximately 90%) contained microsatellites so that all inserts were sequenced without further screening. Sequencing using M13 forward and reverse vector primers was either conducted in-house on a LiCor 4200 or ABI 3130xl automated sequencer, or outsourced to Macrogen (Seoul, Korea). To improve signal quality for sequences containing long microsatellites, DMSO was added to a final concentration of 5%. BLAST searches were carried out for all sequences to exclude the possibility of contamination with mobilized fragments from the reporter genome.

Scoring criteria

Microsatellites in the insert sequences were identified with a newly written detection program “Phobos” version 2.3 beta (Mayer et al. 2006, submitted) which makes use of the algorithm in the program “Sputnik” (Abajian 1994).

Microsatellites with repeat units ranging from 2 to 6 nucleotides in length and with a percentage perfection of ≥90% were included in the search. The minimum number of repeat units for a microsatellite to be included in Phobos’ output was set to five repeat units for dinucleotide repeats, four repeat units for trinucleotide repeats, and three repeat units for all other repeat types. Repeats shorter than this threshold or containing more than the allowed 10% non-repeat nucleotides were ignored. For statistical analyses of microsatellite properties such as length and perfection characteristics, the Phobos output was analysed by the complementary program “ssr-stat” (see Leese et al. 2006, submitted, for details).

All other settings were according to the recommendations given by Chambers and MacAvoy (2000).
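The minimum-repeat thresholds above can be illustrated with a small detector for perfect repeats. This is a sketch only and not the Phobos or Sputnik algorithm: it uses plain regular expressions, finds perfect arrays exclusively (the ≥90% perfection rule for imperfect repeats is not implemented), and the names `MIN_REPEATS`, `is_primitive` and `find_perfect_microsatellites` are hypothetical.

```python
import re

# Minimum number of repeat units per unit length, as in the scoring criteria:
# 5 for dinucleotides, 4 for trinucleotides, 3 for tetra- to hexanucleotides.
MIN_REPEATS = {2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def is_primitive(unit):
    """True if the unit is not itself a repetition of a shorter motif
    (e.g. 'AGAG' is really the dinucleotide 'AG', 'AA' a homopolymer)."""
    n = len(unit)
    return not any(n % k == 0 and unit == unit[:k] * (n // k)
                   for k in range(1, n))


def find_perfect_microsatellites(seq):
    """Return sorted (start, unit, number_of_repeats) for perfect arrays."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # One unit followed by at least (min_rep - 1) exact copies of it.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            if is_primitive(unit):
                hits.append((m.start(), unit, len(m.group(0)) // unit_len))
    return sorted(hits)


# Invented test insert: (AG)21, (ACG)3 (below threshold) and (GATA)4.
insert = "TTGC" + "AG" * 21 + "CCGT" + "ACG" * 3 + "TT" + "GATA" * 4 + "C"
hits = find_perfect_microsatellites(insert)
print(hits)
```

The (ACG)3 array is ignored because trinucleotide repeats need at least four units, which mirrors how repeats below the threshold were excluded from the counts reported in the Results.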

Results

In total, 371 colonies with inserts enriched for microsatellites using the cross-hybridization technique were screened. Among these, two colonies contained redundant fragments (i.e. identical to other inserts) and were excluded from analysis. Of the remaining 369 colonies, 326 colonies contained inserts with at least one microsatellite (88.3% positives). The remaining 43 colonies either contained no microsatellites or only repeats shorter than our scoring threshold outlined above (11.7% negatives). According to these criteria, a total of 781 microsatellites were detected within the 326 positive inserts (mean 2.4 ± 1.6 microsatellites per positive insert). A maximum number of ten independent microsatellites were detected within a single 652 bp insert.

Among the different repeat families, dinucleotide repeats were the most frequent (66.6%), followed by tetranucleotide (16.3%), pentanucleotide (11.4%), hexanucleotide (2.9%), and trinucleotide repeats (2.8%). As many as 54 different repeat types were found in total. Dinucleotide microsatellites were typically composed of more repeat units per microsatellite than microsatellites with longer repeat units (tri- to hexanucleotide repeats), although the length frequency distribution also varies between repeat motifs of identical unit length (Table 1).
Table 1

Types, total number and relative frequency of microsatellites detected within the genome of Ceratoserolis trilobitoides sensu lato by screening 369 non-redundant inserts. The Ceratoserolis genome was hybridized against immobilized genome fragments of Mus musculus, Drosophila melanogaster, and Homo sapiens according to the reporter genome protocol by Nolte et al. (2005) (further details in Leese et al. 2006, submitted)

| Type of microsatellite | Frequency (n) | Frequency (% of total) | Average number of repeats (mean ± SD) | Number of different motifs |
|---|---|---|---|---|
| Dinucleotide | 520 | 66.6 | 25.7 ± 26.7 | 4 |
| Trinucleotide | 22 | 2.8 | 8 ± 8.8 | 9 |
| Tetranucleotide | 127 | 16.3 | 11.6 ± 15.5 | 17 |
| Pentanucleotide | 89 | 11.4 | 13.2 ± 15.6 | 13 |
| Hexanucleotide | 23 | 2.9 | 8 ± 7.3 | 11 |

The proportion of perfect microsatellites, characterized by arrays of a certain repeat unit that are not interrupted by nucleotide substitutions, was 51.5%.
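The summary figures in this section follow directly from the raw counts. As a quick consistency check, the short sketch below recomputes them from the numbers given in the text and in Table 1; the variable names are chosen here for illustration only.

```python
# Counts per repeat family, taken from Table 1.
counts = {
    "dinucleotide": 520,
    "trinucleotide": 22,
    "tetranucleotide": 127,
    "pentanucleotide": 89,
    "hexanucleotide": 23,
}
# Number of different motifs per repeat family, taken from Table 1.
motifs = {
    "dinucleotide": 4,
    "trinucleotide": 9,
    "tetranucleotide": 17,
    "pentanucleotide": 13,
    "hexanucleotide": 11,
}

screened = 369   # non-redundant inserts
positive = 326   # inserts with at least one microsatellite

total = sum(counts.values())
percent = {k: round(100 * v / total, 1) for k, v in counts.items()}
positive_rate = round(100 * positive / screened, 1)
mean_per_insert = round(total / positive, 1)

print(total, percent, positive_rate, mean_per_insert, sum(motifs.values()))
```

The counts sum to the 781 microsatellites reported, and the motif numbers sum to the 54 repeat types mentioned in the text.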

Discussion

Identification of microsatellites is a constant matter of discussion since no consensus has been reached about what a microsatellite actually is (Chambers and MacAvoy 2000; Ellegren 2004). Consequently, the absolute number and length frequency distribution of microsatellite arrays vary with the scoring criteria used to detect microsatellites. This topic is further discussed elsewhere (Leese et al. 2006, submitted).

Shortcomings of mitochondrial genes as molecular markers in studies of speciation and gene flow

Traditionally, molecular phylogenies and the interpretations derived from them, including phylogeography (Avise 2000), have relied extensively on relatively conserved genes. These were the easiest to target because even in unknown genomes they contain conserved regions that allow the construction of so-called universal primers (Simon et al. 1994). However, a corollary of the requirement for conserved stretches of nucleotides serving as universally applicable priming sites is that there is both an upper limit for the variability of the molecular markers located between them and a limit to the number of markers available. The most frequently used genes in the mitochondrial genome suffer from additional limitations. In summary, the utility of mtDNA for population genetic purposes is limited because it is short, all of its genes behave as a single, linked locus due to the lack of recombination, it cannot a priori be assumed to evolve in a strictly neutral fashion, it is generally maternally inherited, its effective population size is only one-quarter of that of nuclear DNA, and its rate of change is limited (Ballard and Whitlock 2004). This in turn limits the application of genes located in the mitochondrial genome approximately to the level of species and the phylogenetic relations between them (Fig. 1).
https://static-content.springer.com/image/art%3A10.1007%2Fs00300-006-0210-x/MediaObjects/300_2006_210_Fig1_HTML.gif
Fig. 1

Different types of molecular markers possess different rates of change and hence resolve different time windows. While slower genes are required to resolve the deeper bifurcations in a phylogenetic tree, faster and more finely resolving markers are better suited for more recent events such as identification of species. In order to study the level of populations or individuals, which is necessary to understand speciation, molecular markers with rates of change that exceed the rates of coding genes are needed. Each class of molecular markers covers a range of substitution rates, which may differ between loci. The order in which the classes of markers appear in the right half of this figure should be taken as an approximation.
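The one-quarter ratio between the effective population sizes of mtDNA and nuclear DNA mentioned above can be made explicit with a short textbook derivation. It is a sketch under idealized assumptions that are not stated in the original text: a diploid species with N breeding individuals, an equal sex ratio (N_f = N_m = N/2), strictly maternal inheritance of a haploid mitochondrial genome, and no heteroplasmy.

```latex
\begin{align}
  N_e^{\mathrm{nuc}} &= \frac{4 N_m N_f}{N_m + N_f}
      = \frac{4 \cdot \tfrac{N}{2} \cdot \tfrac{N}{2}}{N} = N
      &&\Rightarrow\quad 2N \text{ nuclear gene copies per locus}, \\
  N_e^{\mathrm{mt}} &= N_f = \frac{N}{2}
      &&\Rightarrow\quad \frac{N}{2} \text{ transmitted mtDNA copies}, \\
  \frac{\text{mtDNA copies}}{\text{nuclear copies}}
      &= \frac{N/2}{2N} = \frac{1}{4}.
\end{align}
```

Deviations from these assumptions, for example a skewed sex ratio, change the ratio, so the factor of one-quarter should be read as a rule of thumb.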

Populations, gene flow and speciation

In evolutionary biology the term “population” is loosely used to describe members of a species living in a certain area, or more specifically, non-random, variable patterns of genetic connectivity inside a species (Waples and Gaggiotti 2006). According to the latter, two individuals are likely to be more similar in some respect if they belong to the same population and more different if they belong to different populations, either as a result of adaptation to similar environments (selection) or as the result of stronger isolation (genetic drift) between them. It is noteworthy that even if populations are defined on a geographical basis, only the degree of (genetic) isolation between them is often implied. Either way, populations that are sufficiently isolated from one another have the potential to accumulate local traits (morphological, behavioural, physiological, or genetic) that are different from the ones found in other populations (Coyne and Orr 1998).

Most species concepts postulate that for speciation to occur the once contiguous gene pool must be disrupted, either as a primary requirement or as a consequence. The formation of species can thus be interpreted as the accumulation of differences beyond a stage that ensures reproduction and the continuation of a common gene pool for all populations.

However, high levels of gene flow between populations synchronize genetically mediated traits between them and consequently oppose the process of differentiation (drift or selection) and ultimately speciation. Therefore, the amount of gene flow within a species is critically important baseline information in the light of which all scenarios concerning the separation of a single ancestor species into its daughter species must be discussed.

Excluding false negatives

Although some of the faster mitochondrial loci (e.g. 16S rRNA, COI, mitochondrial control region) do conserve intraspecific genetic polymorphisms which under favourable conditions can be interpreted in the context of population genetics (e.g. Roman and Palumbi 2004; Zane et al. 1998), these loci are mostly informative over larger distances (e.g. intercontinental). The main limitation of loci with limited substitution rates becomes obvious when dealing with cases of perceived homogeneity between samples. Homogeneity can have two fundamentally different causes. Either two populations are indeed genetically indistinguishable, as a result of high levels of gene flow between them or a short time since their separation, or they merely appear so because the marker’s rate of change is inadequate to detect existing genetic differences (false negative: no evidence for differentiation when in reality there is differentiation). The impossibility of proving homogeneity is a fundamental limitation and not a property of a particular class of molecular markers. It is therefore advisable to interpret results indicating homogeneity between samples with caution: the absence of evidence for differentiation between populations is not evidence for the absence of such differentiation. By analysing many independent loci covering a broad range of substitution rates (including very high ones), however, the risk of falling victim to false negatives can be greatly reduced.
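The effect of marker speed and of the number of loci on the risk of false negatives can be illustrated with a deliberately crude calculation. The sketch assumes two populations separated for a given number of generations, ignores drift acting on ancestral polymorphism, and asks only how likely it is that no new mutation has arisen on either lineage at any of the loci studied. The mutation rates used below are illustrative placeholders, not values from this study, and the function name `prob_no_new_mutation` is hypothetical.

```python
def prob_no_new_mutation(mu, generations, n_loci=1):
    """Probability that none of n_loci independent loci has mutated on
    either of two lineages separated for `generations` generations,
    given a per-locus, per-generation mutation rate `mu`."""
    return (1.0 - mu) ** (2 * generations * n_loci)


generations = 10_000  # illustrative separation time

# Illustrative rates only: a slowly and a rapidly changing marker.
slow_single = prob_no_new_mutation(1e-8, generations)
fast_single = prob_no_new_mutation(1e-4, generations)
fast_ten_loci = prob_no_new_mutation(1e-4, generations, n_loci=10)

print(f"slow marker, 1 locus : {slow_single:.4f}")
print(f"fast marker, 1 locus : {fast_single:.4f}")
print(f"fast marker, 10 loci : {fast_ten_loci:.2e}")
```

Under these assumptions a slow marker will almost always carry no new information about the split, whereas several fast loci together almost never remain unchanged, which is the intuition behind combining many independent, rapidly evolving loci.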

Fast evolving markers to the rescue

A better marker for population level studies would thus consist of many, unlinked, neutral loci distributed across the genome. This would avoid the potentially misleading effects of mitochondrial genes. It would also ideally consist of loci with different levels (including very high levels) of substitution rates (Gaffney 2000).

There are several classes of markers that come close to these requirements (see Fig. 1). Random amplified polymorphic DNA (RAPD) markers have lost much of their appeal because it has been shown that the method is unreliable and has a tendency to produce artefactual bands not inherited from the parents (Ellsworth et al. 1993; Perez et al. 1998). Restriction fragment length polymorphism (RFLP) is a more reproducible method, but when high rates of change are desirable it yields results that are not easy to interpret due to the multitude of bands. If homologizing beyond mere electrophoretic mobility is desirable or necessary, excising and sequencing or hybridising of bands becomes necessary, which offsets the ease of use of RFLP.

Amplified fragment length polymorphism (AFLP) and microsatellites are perhaps closest to the ideal marker for population studies (Bensch and Akesson 2005). AFLPs tend to be easier to set up initially, whereas microsatellites as co-dominant, multiallelic markers have more analytical power in certain contexts (information per locus 4–10 times higher). A further disadvantage of the AFLP technique is that reproducibility of AFLP patterns can be strongly biased by DNA quality and other methodological parameters (Bensch and Akesson 2005). Other fast evolving markers with genome-wide distribution have become available recently (SNPs, microarrays). A detailed comparison with microsatellites is beyond the scope of this article. However, they do not offer significant advantages over microsatellites in terms of analytical power and especially the effort required for initial set-up (Schlötterer 2004). We are therefore only considering microsatellites (Fig. 2) for the remainder of this paper.
https://static-content.springer.com/image/art%3A10.1007%2Fs00300-006-0210-x/MediaObjects/300_2006_210_Fig2_HTML.gif
Fig. 2

A microsatellite locus from the genome of Ceratoserolis trilobitoides sensu lato. Top: Nucleotide sequence of an (AG)n dinucleotide microsatellite with 21 repeat units; each repeat unit, composed of two nucleotides, is represented by a black or white box. Bottom: Fragment analysis showing a heterozygote specimen with two alleles of different length (242 and 246 bp) and a homozygote with a single allele (250 bp) (black arrows). The less intense peaks indicate fragments that are a result of in vitro artefacts (slippage) during amplification and are not scored.
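The allele lengths in Fig. 2 translate directly into genotypes and repeat-unit differences. The sketch below assumes that length differences between alleles at this dinucleotide locus are due solely to gains or losses of whole repeat units; the function names `genotype` and `repeat_unit_difference` are hypothetical and stand in for what fragment-analysis software does in practice.

```python
def genotype(scored_peaks_bp):
    """Classify a diploid genotype from the scored allele lengths (bp).

    Stutter peaks are assumed to have been removed beforehand.
    """
    alleles = tuple(sorted(set(scored_peaks_bp)))
    if len(alleles) == 1:
        return "homozygote", alleles
    if len(alleles) == 2:
        return "heterozygote", alleles
    raise ValueError("a diploid specimen cannot carry more than two alleles")


def repeat_unit_difference(allele_a_bp, allele_b_bp, unit_len=2):
    """Number of repeat units separating two alleles of one locus."""
    diff = abs(allele_a_bp - allele_b_bp)
    if diff % unit_len:
        raise ValueError("length difference is not a multiple of the repeat unit")
    return diff // unit_len


# Allele lengths as given in the caption of Fig. 2.
print(genotype([242, 246]))              # the heterozygote specimen
print(genotype([250]))                   # the homozygote specimen
print(repeat_unit_difference(242, 246))  # AG units separating the two alleles
```

Because both alleles of a heterozygote are visible, microsatellites are scored as co-dominant markers.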

In contrast to the universal priming sites that function across wide taxonomic groups, microsatellites are not usually flanked by highly conserved regions so that in general microsatellites have to be isolated for every species of interest de novo. Since there can be a significant difference in the frequency and the types of repeats between the genomes of different species which in ecological and systematic studies are generally only poorly known, isolating microsatellites has the reputation of being a tedious and time-consuming process.

Although the mutational processes of microsatellite arrays are still not fully understood and selection on some repetitive motifs (e.g. if located within a coding region or upstream of the promoter region) has been reported (Kashi and Soller 1999; Li et al. 2002), the majority of microsatellite loci are assumed to be selectively neutral. In contrast to the mitochondrial genes, which in the absence of recombination behave as a single, linked locus, the null hypothesis is that microsatellites are unlinked and act as independent loci.

In summary, microsatellites are in almost perfect agreement with the requirements of a better marker for fine-scaled studies at the population level. The effort required for their initial set-up is perhaps the main reason why microsatellites have not been used more widely. In that sense, there is a trade-off between ease of use, reliability, and analytical power of molecular markers.

Simplified ways to obtain suitably fast markers

A variety of protocols exist for the identification and isolation of microsatellite loci. These differ markedly in their requirements (time and equipment) as well as their efficiency (Leese et al. 2006, submitted; Zane et al. 2002).

The case study presented in this paper demonstrates that it has become possible to isolate a large number of microsatellite loci from an unknown genome with reasonable effort (Leese et al. 2006, submitted). The use of a modified protocol (Nolte et al. 2005) shows that many obstacles previously associated with the isolation of microsatellites may no longer be a problem.

One of the main advantages of the reporter genome protocol is that no assumptions have to be made concerning the type of repeats to be screened. While it is entirely possible in traditional protocols to miss even abundant repeat types by not including the corresponding reporter oligomer in the screening, the reporter genome protocol will report all repeat types that are present in both reporter and target genome, a most valuable property when dealing with unknown genomes (Leese et al. 2006, submitted).

Another welcome property of microsatellites as molecular markers is that they typically require the amplification of short (100–250 bp) fragments only. This facilitates amplification of partly degraded material not collected and preserved with molecular studies in mind. The potential inclusion of museum material fixed in formalin would greatly alleviate one of the main problems of working in the Antarctic: the notoriously difficult and unpredictable acquisition of samples.

Conclusions

Fast evolving molecular markers such as microsatellites allow a far more detailed look at the processes at the population level than more conserved markers. Analyses of microevolutionary processes structuring the Antarctic benthos are needed for a better understanding of how locally characteristic traits evolve in a species in this unique environment. This opens new perspectives on the origin of biodiversity and the evolution of adaptations to the environment in the Southern Ocean. Collecting data on fast evolving loci in the Antarctic benthos is also a prerequisite for detecting a response of Antarctic benthic communities to future environmental change.

We provide evidence from one Antarctic crustacean that a new isolation protocol has brought suitably fast evolving microsatellite markers within reach of a standard molecular lab working on an unknown genome within a short time.

Acknowledgments

Arne Nolte (University of Cologne) provided his microsatellite protocol ahead of print and offered valuable assistance during the initial test phase in the lab. For computational assistance and the development of the powerful computer programs “Phobos” and “ssr-stat” we thank Christoph Mayer (University of Bochum). Wolfgang Wägele (Museum Koenig, Bonn) discussed various aspects of the project with us. We also thank Joseph Eastman, Sven Thatje, and one anonymous reviewer for valuable comments and suggestions on the manuscript. This work was supported by the DFG grant He 3391/3 to Christoph Held (AWI Bremerhaven) and in part by NSF grant OPP 01-32032 to H. William Detrich (Northeastern University, Boston, USA). This is publication number 11 from the ICEFISH Cruise of 2004.

Copyright information

© Springer-Verlag 2006