Anonymous nuclear loci (ANL) are markers in non-coding regions of the nuclear genome that are unlikely to be under selection and that have many potential applications in molecular systematics, especially at lower taxonomic levels (Karl and Avise 1993). Traditionally, ANL discovery involved time-consuming laboratory work to generate a small number of usable markers, but Bertozzi et al. (2012) recently described a bioinformatics pipeline for developing candidate ANL from next-generation sequencing (NGS) data.

Caecilians (Gymnophiona) are limbless, mostly soil-dwelling amphibians largely restricted to the moist tropics, and the approximately 200 extant species comprise one of the most poorly known major vertebrate groups (Gower and Wilkinson 2008; Wilkinson 2012). Anonymous nuclear markers have not previously been developed for caecilians, microsatellites have been developed for only two species (Li et al. 2010; Barratt et al. 2012), and published coding nuclear data are not very variable at lower taxonomic levels.

A radiation of six nominal caecilian species in three genera (Grandisonia, Hypogeophis, Praslinia) occurs in the Seychelles (Nussbaum 1984; Wilkinson and Nussbaum 2006; Wilkinson et al. 2011). Although the radiation is clearly monophyletic (Nussbaum and Ducey 1988; Hedges et al. 1993; Gower et al. 2011), analyses of (mostly mtDNA) sequence data have been unable to robustly resolve all relationships within it (Hedges et al. 1993; Wilkinson et al. 2002, 2003; Loader et al. 2007; Gower et al. 2008, 2011). ANL could provide a useful tool for Seychelles caecilian systematics and conservation genetics, especially given the generally elevated levels of threat faced by island biotas (e.g., Frankham 2008). Two Seychelles caecilians (P. cooperi and H. brevis) are classified as Endangered on the IUCN Red List.

Six samples from five Seychelles caecilian species (all Seychelles species except P. cooperi; two samples of G. alternans) were selected for marker development using NGS. Genomic DNA was extracted from liver using Qiagen DNeasy Blood and Tissue kits. Samples were prepared using a standard Illumina Nextera DNA kit, and paired-end reads (≤251 bp long) were sequenced using a 500-cycle v.2 reagent kit on the Illumina MiSeq platform. Paired-end data were combined and cleaned using default settings in Geneious v.6.1.4. Cleaned files were run through Bertozzi et al.’s (2012) Perl bioinformatics pipeline, with our customised BLAST database comprising the Xenopus tropicalis and Danio rerio genomes plus all caecilian entries in GenBank. Each sample was run through the pipeline individually (see supplementary material). Candidate anonymous nuclear markers were selected at random from paired reads ≥245 bp, and 36 primer pairs (6 per sample) were designed using Primer3 v.0.4.0 (Koressaar and Remm 2007; Untergasser et al. 2012). Primers were selected to lie as close as possible to the ends of each read (to maximise amplicon length) and to have low self-complementarity and annealing temperatures of 60 °C (±1 °C); all other settings were default. Primer sequences were subjected to an additional BLAST search to check locus anonymity.
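The random selection step described above can be illustrated with a minimal Python sketch. It is not part of Bertozzi et al.'s (2012) Perl pipeline; the function name `select_candidates`, the fixed random seed and the toy reads are all assumptions made for illustration, and only the ≥245 bp threshold and the six loci per sample come from the text.

```python
import random


def select_candidates(reads, min_len=245, n_pick=6, seed=0):
    """Pick n_pick reads at random from those at least min_len bp long.

    reads is a list of (name, sequence) tuples. The seed is a hypothetical
    choice that only makes this toy example repeatable.
    """
    eligible = [(name, seq) for name, seq in reads if len(seq) >= min_len]
    rng = random.Random(seed)
    # Never ask for more reads than are eligible.
    return rng.sample(eligible, min(n_pick, len(eligible)))


# Toy input: even-numbered reads are 280 bp, odd-numbered reads are 200 bp.
reads = [
    (f"read{i}", "ACGT" * 70 if i % 2 == 0 else "ACGT" * 50)
    for i in range(20)
]

picked = select_candidates(reads)
for name, seq in picked:
    print(name, len(seq))
```

In practice the input would be the candidate reads output by the pipeline for one sample, and the call would be repeated once per sample.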

The 36 primer pairs were tested using the polymerase chain reaction (PCR) on five genomic DNA samples, one for each of the Seychelles species other than P. cooperi (see supplementary material). Each 25 μl reaction contained 1 μl of template, 1 μl of each primer, 9.5 μl of ddH2O and 12.5 μl of 2× MyTaq Mix. Cycling conditions for all primer pairs were 95 °C for 3 min; 35 cycles of 95 °C for 15 s, 60 °C for 15 s and 72 °C for 20 s; and 72 °C for 10 min. Fifteen of the 36 primer pairs successfully amplified DNA in all five species, and amplicons for these were subjected to Sanger sequencing. Assembled and edited sequences were aligned using default settings for consensus alignments in Geneious. Eight of the 15 loci were considered suitable for further testing; the seven loci rejected at this stage generally yielded poor sequences, perhaps indicative of suboptimal primer/template combinations and/or PCR settings. Sequences were subjected to a BLAST search to check anonymity.
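As a quick arithmetic check, the reaction components listed above sum to the stated 25 μl. The Python sketch below confirms this and scales the reagents shared across reactions for a batch. The `master_mix` helper and its 10 % pipetting overage are hypothetical conveniences, not part of the published protocol.

```python
# Per-reaction volumes in microlitres, as listed in the text.
PER_REACTION_UL = {
    "template": 1.0,
    "forward_primer": 1.0,
    "reverse_primer": 1.0,
    "ddH2O": 9.5,
    "MyTaq_mix_2x": 12.5,
}

total_ul = sum(PER_REACTION_UL.values())


def master_mix(n_reactions, overage=0.1):
    """Scale the reagents shared by every reaction (water and polymerase mix).

    overage is a hypothetical pipetting allowance (10 % by default);
    template and primers are left out because they differ between reactions.
    """
    shared = ("ddH2O", "MyTaq_mix_2x")
    factor = n_reactions * (1.0 + overage)
    return {name: round(PER_REACTION_UL[name] * factor, 2) for name in shared}


print(f"Total reaction volume: {total_ul} ul")
print("Shared reagents for five reactions:", master_mix(5))
```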

The eight surviving candidate ANL were tested further by attempting PCR amplification of genomic DNA from 12 additional individuals of the five species in which they had already amplified, plus two individuals of their Seychelles sister species P. cooperi (see supplementary material). One locus was excluded from further analysis because it failed to sequence well in any specimen of H. rostratus or G. larvata. Descriptive statistics were generated for the remaining seven ANL using DnaSP v.5.10 (Librado and Rozas 2009; see Table 1); all seven were variable across the Seychelles species.
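For readers unfamiliar with descriptive statistics of the kind DnaSP reports, the Python sketch below computes two common ones, the number of segregating sites and nucleotide diversity (the mean proportion of pairwise differences), for a toy alignment. Whether these particular statistics appear in Table 1 is an assumption; the sequences are invented, and the code assumes an alignment of equal-length sequences without gaps or ambiguity codes.

```python
from itertools import combinations


def segregating_sites(seqs):
    """Count alignment columns that contain more than one nucleotide."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def nucleotide_diversity(seqs):
    """Mean number of pairwise differences per site across all sequence pairs."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)


# Invented 10 bp alignment of four sequences (illustration only).
alignment = [
    "ACGTACGTAC",
    "ACGTACGTAC",
    "ACGAACGTAC",
    "ACGAACGTTC",
]

s = segregating_sites(alignment)
pi = nucleotide_diversity(alignment)
print(f"Segregating sites: {s}")
print(f"Nucleotide diversity: {pi:.4f}")
```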

Table 1 The seven anonymous nuclear markers developed successfully in this study

Our NGS data initially produced approximately 5,000 candidate ANL with a potential length of ≥245 bp. Approximately 20 % (seven) of the 36 candidate ANL that we subsequently selected at random generally amplified and sequenced well and showed variability in the Seychelles caecilians. Our successful approach differed from Bertozzi et al.’s (2012) in that we developed ANL from Illumina NGS data rather than from data generated on the more expensive Roche 454 platform, albeit at the expense of sequence length.
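The attrition from designed primer pairs to usable markers can be tabulated from the counts reported in this study (36 designed, 15 amplified in all five species, 8 retained after sequencing, 7 analysed). The short Python sketch below does this; the stage labels are paraphrases introduced here for illustration.

```python
# Counts taken from the text; the stage labels are paraphrased.
stages = [
    ("primer pairs designed", 36),
    ("amplified in all five species", 15),
    ("retained after Sanger sequencing", 8),
    ("analysed as final markers", 7),
]

n_designed = stages[0][1]
success_rates = {name: count / n_designed for name, count in stages}

for name, count in stages:
    print(f"{name}: {count} ({success_rates[name]:.1%} of designed)")
```

The final stage works out to 7/36, or about 19.4 %, consistent with the figure of approximately 20 % given above.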