The genetic diversity and divergence of avian populations inhabiting the Arctic generally seem to be low compared with those breeding at lower latitudes (Hewitt 2000, 2004; Weider and Hobæk 2000). This is usually attributed to historical changes in the geographic ranges of the species, associated with the cycles of glaciations during the Quaternary (Hewitt 2000, 2004; Martin and McKay 2004; Adams and Hadley 2013). The shifting of species ranges could reduce the genetic diversity and/or divergence of populations through several mechanisms, such as reduction in population size (Hewitt 2000, 2004) and/or population mixing, particularly in combination with possible selection for increased dispersal during the recolonization periods (Martin and McKay 2004 and references therein). At present, the limited availability of habitat suitable for breeding in the Arctic causes disjunctive breeding locations, which together with intrinsic barriers, such as site fidelity and philopatry, can reduce the birds’ dispersal and lead to genetic structuring of populations (Friesen et al. 2007). As a consequence of all these processes, the genetic patterns of Arctic bird populations are quite variable. Identifying and quantifying the patterns of genetic diversity and differentiation of Arctic bird species may help to understand the evolutionary paths and ongoing population processes in this ecosystem (Avise 2000; Crandall et al. 2000).

Arctic species at remote locations continue to be a challenge to investigate and are thus underrepresented in the literature. This is particularly visible in meta-analytical studies of biogeographic patterns of genetic variability, where high-latitude birds are scarce (e.g., Martin and McKay 2004; Weir and Schluter 2007). Investigations of truly Arctic species are therefore highly valuable. In this study, phylogeography and population genetics were examined in a small, colonial seabird, the little auk (dovekie) Alle alle, whose breeding range is entirely restricted to the high Arctic.

The little auk is believed to be one of the most numerous seabirds of the world (>37 million pairs, Kampp et al. 1987; Mehlum and Bakken 1994; Isaksen and Bakken 1996; Boertmann and Mosbech 1998; Stempniewicz 2001; Egevang et al. 2003). The majority of the global population is concentrated in the Atlantic sector of the Arctic, with the largest colonies located in Greenland and the Svalbard archipelago (Stempniewicz 2001; Fig. 1). Much smaller colonies are located on the southern (Jan Mayen and Bjørnøya) and eastern borders (Novaya Zemlya and Severnaya Zemlya) of the breeding range, while only small numbers have been reported to breed in the Pacific sector of the Arctic, namely the Bering Strait (Diomede Island, and possibly on St Lawrence Is. and King Is.) and the Bering Sea (St Matthew Is. and Pribilof Is.; Day et al. 1988; Montevecchi and Stenhouse 2002). Two subspecies have been recognized: the nominate A. a. alle and A. a. polaris. The latter is larger (Stempniewicz et al. 1996; Wojczulanis-Jakubas et al. 2011) and inhabits Franz Josef Land and possibly Novaya Zemlya, whereas the nominate race occurs over the rest of the breeding range (Fig. 1). Although the little auk is considered a keystone species of the Arctic ecosystems (Stempniewicz 2005; Stempniewicz et al. 2007), the population genetics of this species has never been studied.

Fig. 1 Distribution and relative size of little auk breeding populations (indicated by black circles) sampled in the study. The maps were produced with Ocean Data View; relative population sizes are based on Stempniewicz (2001). See Table 1 for colony abbreviations

Some genetic differentiation of the little auk populations might be expected, given the insular distribution of the colonies (Fig. 1) in combination with a presumably high rate of nest-site fidelity (K. Wojczulanis-Jakubas, unpublished data, and by analogy with related species of similar life-history traits, reviewed in Divoky and Horton 1995). Moreover, intraspecific differentiation in body size (Stempniewicz et al. 1996), including some differentiation within the nominate subspecies (Wojczulanis-Jakubas et al. 2011), suggests that breeding colonies may constitute isolated populations.

Materials and methods

Sampling and laboratory procedure

Blood, feathers, or tissues were collected from 328 little auks from ten breeding locations in the Atlantic and Pacific oceans (Table 1; Fig. 1). Blood samples (25 μL) were taken from birds captured in the colony during the breeding season in the years 2001–2010. The blood was preserved in 96 % ethanol until DNA extraction. Feathers were collected from birds found dead in the colony in Franz Josef Land. Tissue samples were collected in the Alaska Museum from skins and two frozen specimens (voucher catalog numbers: UAM 1000, UAM 3399, UAM 4883, UAM 5263, UAM 5384, UAM 13203, and UAM 27041). All sampled birds were adults (aged >2 years, distinguished from subadults by the appearance of wing coverts and flight feathers, Stempniewicz 2001) and presumably unrelated to each other.

Table 1 Numbers of little auks analyzed for microsatellite (Msats) and mitochondrial markers (mtDNA) for ten breeding colonies, ordered according to longitude and latitude

DNA was extracted from blood samples using kits designed for blood samples (Blood Mini, A&A Biotechnology, and EZNA, Omega Bio-tek). For feathers and frozen tissue, DNeasy Tissue kits (Qiagen) were used. Three DNA samples from each of four distant locations, comprising the two little auk subspecies, were selected for sequencing the mitochondrial control region (A. a. alle: the Atlantic area, MH_SVA and IS_SVA, and the Pacific area, DI_ALA; A. a. polaris: the Atlantic area, FJ_RUS; see abbreviation codes in Table 1). The primers were designed based on the sequences of the mtDNA control region available in GenBank for the little auk and the closely related common and Brünnich’s guillemots and razorbill (Alca torda) (AaCRF1: 5′cctgaattttcacattcccttt and AaCRR1: 5′ttatgcccaacaagcattca). The following protocol was used for the amplification preceding sequencing: PCR reaction volume was 15 μL, containing 0.6 mM dNTPs, 0.03 U/μL Dynazyme II DNA Polymerase (Finnzymes), 1× buffer (10 mM Tris–HCl, 1.5 mM MgCl2, 50 mM KCl, and 0.1 % Triton X-100; Finnzymes), 0.5 mM primer, and 3 μL DNA extract. The conditions of the reaction were as follows: 5 min at 94 °C; 30 cycles of 30 s at 94 °C, 30 s at the annealing temperature, and 30 s at 72 °C; and a final extension period of 10 min at 72 °C. Cycle-sequencing reactions were carried out using an ABI PRISM BigDye Terminator v1.1 Cycle-Sequencing Kit, with a reaction volume of 10 μL, 1 μL primer, and 5 μL PCR product. Cycle-sequencing products were run on an ABI PRISM 3100 Genetic Analyzer following the manufacturer’s instructions (Applied Biosystems).

For microsatellite genotyping, 325 samples were screened for variation at six loci (Table 1). Initially, ten primer pairs were tested for cross-species amplification in a subset of samples (n = 96). Of these, Apy03, Apy06, Apy08, Apy09, and Apy14 were originally developed for the whiskered auklet, Aethia pygmaea (Dawson et al. 2005); Uaal-23 and Uaa5-8 were developed for the common guillemot, Uria aalge; and Ulo12a-12, Ulo12a-22, and Ulo14b-29 were developed for the Brünnich’s guillemot (Ibarguchi et al. 2000). All markers gave a PCR product in the little auk. However, because four loci showed evidence of null alleles or insufficient polymorphism, only six (Apy03, Apy06, Apy08, Apy14, Ulo12a-22, and Ulo14b-29) were included in further analyses (Table 2). Polymerase chain reactions (PCRs) were performed in volumes of 10 μL [containing 0.1 μL dNTPs (0.6 mM), 0.1 μL Dynazyme DNA Polymerase (0.3 U; Finnzymes), 1 μL buffer (10 mM Tris–HCl, 1.5 mM MgCl2, 50 mM KCl, and 0.1 % Triton X-100; Finnzymes), 0.5 μL of the two primers (0.5 mM), and 2 μL DNA extract]. Forward primers were labeled with HEX, NED, or FAM. The reactions were conducted under the following conditions: 5 min at 94 °C; 30 cycles of 30 s at 94 °C, 30 s at the annealing temperature (Table 2), and 30 s at 72 °C; and a final extension for 10 min at 72 °C. The PCR products were run on an ABI PRISM 3100 Genetic Analyzer following the manufacturer’s instructions (Applied Biosystems) and scored in GeneMapper 3.0 (Applied Biosystems). To control the genotyping precision, the same two reference samples were added to each PCR and genotyping run.

Table 2 Characteristics of microsatellite loci used in the little auk

Statistical analyses

MtDNA. The homology of the sequences was confirmed using BLAST (Altschul et al. 1990) by comparing the obtained sequences with sequences available in GenBank from alcids, including the little auk (Alle alle AJ242684; Alca torda AJ242683; Uria aalge AJ242686; Uria lomvia AJ242687). The sequences were quality checked and trimmed to the same length in BioEdit (Hall 1999). The sequences could be unambiguously aligned without inserting gaps. Alignment statistics and mitochondrial DNA polymorphism, quantified as the number of haplotypes (n), haplotypic diversity (h), and nucleotide diversity (π), were calculated in DnaSP v5.10.01 (Librado and Rozas 2009) and Mega v5.2 (Tamura et al. 2011).
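The two diversity measures named above can be illustrated with a minimal sketch. It assumes Nei's unbiased estimator for h and the mean pairwise proportion of differing sites for π; the function names and the toy alignment are hypothetical and are not the DnaSP or Mega implementations, nor data from this study.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Nei's unbiased haplotype diversity: h = n/(n-1) * (1 - sum(p_i^2))."""
    n = len(seqs)
    counts = Counter(seqs)
    return n / (n - 1) * (1.0 - sum((c / n) ** 2 for c in counts.values()))


def nucleotide_diversity(seqs):
    """Mean proportion of differing sites over all pairs of aligned sequences."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / (len(pairs) * length)


# Hypothetical toy alignment of four gap-free sequences (not study data)
toy = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
h = haplotype_diversity(toy)
pi = nucleotide_diversity(toy)
```

Both functions assume sequences of equal length without gaps, which matches the alignment described above.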

For phylogenetic analyses, the most appropriate model of mtDNA sequence evolution was determined, and the nucleotide substitution parameters were estimated by jModelTest 2 (Darriba et al. 2012). The TPM2 model was selected by decision theory performance-based selection (DT) and the Bayesian information criterion (BIC). A maximum likelihood consensus tree was calculated in the PhyML v. 3.0 software (Guindon and Gascuel 2003), under the general settings of the selected model with 500 bootstrap iterations. An Alca torda control region sequence (GenBank AJ242683) was used as out-group to root the tree. However, in the case of mitochondrial markers with high intrapopulation polymorphism, construction of classic phylogenetic trees is often not appropriate, resulting in uncertainties of the relationships between haplotypes. In such cases, estimating a network of haplotypes connected by a minimal number of mutational steps may be a better solution. Therefore, relationships between control region haplotypes were reconstructed using the median-joining algorithm (Bandelt et al. 1999) in Network v4.6.1.0. This method groups related haplotypes through median vectors into a tree or network. Different settings for the homoplasy level parameter, ε, were tested, and ε = 20 was eventually used. To account for differences in substitution rates, a weight of 1 for transitions and 2 for transversions was applied. Ambiguous relationships were resolved with a maximum parsimony (MP) heuristic algorithm.

Microsatellites. The presence of null alleles at microsatellite loci was tested with MICRO-CHECKER 2.2.3 (van Oosterhout et al. 2004). Linkage disequilibrium between pairs of loci (i.e., departure from independent segregation) was tested in Arlequin 3.11 (Excoffier et al. 2006). Arlequin 3.11 was also used to test deviations from Hardy–Weinberg equilibrium (HWE), with exact P values estimated using the Markov chain Monte Carlo (MCMC) procedure with 100,000 dememorization steps.

The genetic diversity of each colony was described by the mean number of alleles per locus, observed and expected heterozygosity, and the inbreeding coefficient F IS, all calculated in FSTAT (Goudet 2002). Allelic richness (R) and private allelic richness (R PA) across the colonies were calculated using HP-RARE (Kalinowski 2005). Genetic variation existing among and within colonies was analyzed with an analysis of molecular variance (AMOVA), using ARLEQUIN 3.11. The genetic differentiation among the colonies was estimated based on the F-statistics of Weir and Cockerham (1984) using pairwise comparison tests in ARLEQUIN 3.11. The significance level of multiple pairwise comparisons was adjusted using the Benjamini–Yekutieli correction (Narum 2006). As F ST strongly depends on within-population heterozygosity (Meirmans and Hedrick 2011), the standardized measure of genetic differentiation F′ ST (Hedrick 2005) was calculated, using RecodeData v.0.1 (Meirmans 2006) to transform the data set.
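The idea behind the standardized measure can be sketched as follows. RecodeData obtains the maximum attainable differentiation by recoding the data so that populations share no alleles; for the related statistic G ST, Hedrick (2005) gives a closed-form maximum, which is used here instead purely for illustration. The function name and the input values are hypothetical, and the sketch is not the procedure applied in this study.

```python
def hedrick_standardized(gst, hs, k):
    """Standardized differentiation G'_ST = G_ST / G_ST(max).

    gst : observed G_ST
    hs  : mean within-population expected heterozygosity
    k   : number of populations compared
    G_ST(max) = (k - 1) * (1 - hs) / (k - 1 + hs)  (Hedrick 2005)
    """
    gst_max = (k - 1) * (1.0 - hs) / (k - 1 + hs)
    return gst / gst_max


# Hypothetical inputs: low differentiation combined with high heterozygosity
standardized = hedrick_standardized(gst=0.005, hs=0.75, k=2)
```

With high within-population heterozygosity the attainable maximum is far below 1, so even a small raw value can correspond to a noticeably larger standardized value.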

To infer population structure (i.e., the number of distinct genetic groups or clusters), Bayesian clustering analyses were performed in STRUCTURE 2.3.4 (Pritchard et al. 2000) using both “admixed” and “not admixed” models for ancestry. The first model is recommended as a starting point for analyses, while the second may be better to detect a subtle structure (Pritchard and Wen 2004). The correlated allele frequency model, which is supposed to improve clustering for closely related populations (Pritchard and Wen 2004), was applied. Sampling locations as prior information were used to assist clustering, since this approach is recommended for data sets where a signal for structure is expected to be weak (Hubisz et al. 2009). The burn-in length was set to 100,000 followed by 1,000,000 iterations of the MCMC estimation procedure. The analyses were run for each value of K (number of clusters) from 1 to 9. The interpretation of the true value of K was based on the size of the mean log likelihood of K (Pritchard and Wen 2004). Additionally, to estimate contemporary levels of gene flow, probability of individual-based assignment to the origin and other colonies was estimated using GeneClass 2.0 (Piry et al. 2004).

The relationship between genetic (matrix of pairwise F ST) and geographical distance was analyzed to test an isolation by distance pattern. Since the most distant colony (DI_ALA) was also poorly represented in terms of sample number and quality, and span of the sampling period (7 museum samples collected over 42 years), the analysis was performed both with and without this colony. Both analyses were performed using Mantel tests with Euclidean similarity measure and 5,000 permutations (Legendre 2000) in PAST 1.87 (Hammer et al. 2008). The analysis evaluates the statistical significance of the correlation between two or more distance matrices, using permutation tests (Telles and Diniz-Filho 2005). The matrix of geographical distances was built from the shortest distances between colonies, calculated using the measurement tool in Google Earth 6.2.1.
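The permutation logic of a Mantel test can be sketched in plain Python. This is a minimal illustration, assuming a Pearson correlation between the upper triangles of the two matrices and a one-tailed P value; it is not the PAST implementation, and the colony positions and distances below are hypothetical.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def upper(matrix, order):
    """Upper-triangle entries after reordering rows and columns jointly."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel(gen, geo, permutations=5000, seed=1):
    """Return (r, one-tailed P) for two square distance matrices."""
    identity = list(range(len(gen)))
    x = upper(geo, identity)
    r_obs = pearson(x, upper(gen, identity))
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # relabel colonies in the genetic matrix
        if pearson(x, upper(gen, order)) >= r_obs - 1e-12:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Hypothetical colonies along a line (km); genetic distance rises with distance
positions_km = [0, 100, 250, 600, 3000]
geo = [[abs(a - b) for b in positions_km] for a in positions_km]
gen = [[1e-5 * d for d in row] for row in geo]
r, p = mantel(gen, geo)
```

Rows and columns are permuted together because the entries of a distance matrix are not independent observations; shuffling individual cells would break the matrix structure.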

Results

MtDNA. The alignment of the mitochondrial control region sequences of A. alle produced 537 sites covering Domain I and CSB regions, 13 of which were variable. Although nuclear copies of mitochondrial genes have been reported in other seabirds, little auk sequences did not differ from models expected for true mtDNA (Baker and Marshall 1997; Friesen et al. 2005). The observed pattern of sequence evolution corresponded to higher variation in the region of Domain I and to slower rate of substitution in conserved sequence blocks (CSBs; Baker and Marshall 1997). Ten control region haplotypes (h = 0.955, π = 0.005) were found among 12 individuals of both little auk subspecies (deposited in GenBank, KC899681–KC899692). The haplotypes differ by one to two substitutions with no insertions/deletions.

The constructed ML phylogenetic mtDNA tree indicated a very shallow, recent phylogeny. Most of the subdivisions had low support (data not shown). Haplotypes of individuals assigned to different subspecies from different localities were mixed in the tree. In the median-joining network, the total number of identified haplotypes was 10 (Fig. 2). The most frequent haplotype was represented by three individuals [two individuals of A. a. polaris (FJ-RUS1 and FJ-RUS4) and one of A. a. alle (MH-SVA100)], while nine other haplotypes occurred in single individuals. No reciprocal monophyly was observed for samples representing the two subspecies, and no clear phylogeographical pattern was revealed.
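As a consistency check, the reported haplotype diversity can be recovered from these frequencies (one haplotype in three individuals and nine singletons among n = 12), assuming Nei's unbiased estimator was the one applied:

```latex
h = \frac{n}{n-1}\left(1 - \sum_{i} p_i^{2}\right)
  = \frac{12}{11}\left(1 - \left[\left(\tfrac{3}{12}\right)^{2} + 9\left(\tfrac{1}{12}\right)^{2}\right]\right)
  = \frac{12}{11}\cdot\frac{126}{144}
  \approx 0.955
```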

Fig. 2 Median-joining network of the little auk mtDNA control region haplotypes. Circle size is approximately proportional to the number of individuals exhibiting the corresponding haplotype. Connector length is proportional to the number of mutations between haplotypes. More than one mutational step is indicated by Arabic numerals

Microsatellites. Results of the basic analyses of the six loci suggested the presence of linkage disequilibrium in the data set. However, the overall pattern of linkage disequilibrium was not consistent with that observed in separate colonies (Table S1). There was some evidence of departure from HWE in four loci, but the pattern of deviation was again not consistent across colonies (Table 2 and Table S2). Given the inconsistencies in the patterns of both linkage disequilibrium and deviation from HWE, all loci were considered appropriate for the population genetic analyses and used in the further investigation (Table 2). Analyses based on smaller subsets of loci [5 loci (excluding Apy03, which seemed to be in linkage disequilibrium with two other loci in the pooled data set) and 3 loci (excluding Apy06, Ulo12a-22, and Ulo14b-29, which showed deviation from HWE in some colonies)] gave results qualitatively similar to those of the analysis of all loci.

All six loci were polymorphic (Table 2), with allelic richness in colonies ranging from 3.00 to 8.37 (Table 3). There were consistently high levels of observed heterozygosity (ranging from 0.66 to 0.83) across all colonies (Table 3). The inbreeding coefficients (F IS) were generally low, but significant for three colonies (MH_SVA, MA_SVA, and KF_SVA; Table 3). Private allelic richness was low but variable, ranging from <0.01 at DI_ALA to 0.93 at HI_GRE (Table 3).
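Allelic richness values of this kind are rarefied to a common sample size so that colonies with different numbers of sampled birds can be compared. A minimal sketch of the standard rarefaction expectation is given below; the function name and the allele counts are hypothetical, and HP-RARE itself additionally estimates private allelic richness, which is not shown.

```python
from math import comb


def rarefied_richness(allele_counts, g):
    """Expected number of distinct alleles in a random subsample of g gene copies.

    allele_counts : observed copies of each allele at one locus in one colony
    g             : standardized subsample size (g <= total number of copies)
    """
    n = sum(allele_counts)
    # 1 - P(allele absent from the subsample), summed over alleles
    return sum(1.0 - comb(n - c, g) / comb(n, g) for c in allele_counts)


# Hypothetical locus: three alleles observed 6, 3, and 1 times among 10 copies
richness = rarefied_richness([6, 3, 1], g=4)
```

Rare alleles contribute less than 1 to the expectation because they are likely to be missed in a small subsample, which is why richness is reported as a non-integer.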

Table 3 Genetic variation at six microsatellite loci in the nine breeding colonies of the little auk

Global AMOVA revealed low, but significant genetic differentiation, with average F ST = 0.005 over all loci (P = 0.03), even when excluding DI_ALA from the analysis (mean F ST = 0.003, P = 0.02). However, it should be noted that most of the variance resided within populations (99.55 %). Pairwise F ST values ranged from −0.014 to 0.036, with significant differences (P < 0.05) between ten colony-pairs (Table 4). When the Benjamini–Yekutieli correction (critical P = 0.012) was applied, only half of these differences were significant (Table 4). Pairwise standardized F′ ST ranged from −0.165 to 0.130 (Table 4).
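The critical P value quoted above can be reproduced with a short sketch, assuming a nominal α of 0.05, all 36 pairwise comparisons among nine colonies, and the single-threshold form of the correction described by Narum (2006). The function name is hypothetical.

```python
def by_critical_p(alpha, k):
    """Critical P: alpha divided by the harmonic sum 1 + 1/2 + ... + 1/k."""
    return alpha / sum(1.0 / i for i in range(1, k + 1))


n_colonies = 9
k = n_colonies * (n_colonies - 1) // 2  # number of colony pairs
critical = by_critical_p(0.05, k)
```

Rounded to three decimals this gives 0.012, a less severe threshold than a Bonferroni adjustment over the same comparisons (0.05/36 ≈ 0.0014).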

Table 4 Pairwise F ST and F′ ST values between nine little auk colonies below and above the diagonal, respectively, based on six microsatellite loci

Clustering analyses revealed no genetic structure. Regardless of the model used, one cluster (K = 1) had the highest value of posterior probability (P ≈ 1). The probability of the correct assignment of individual genotypes to their colony of origin was low, ranging from 0.162 to 0.430, and generally similar to that of assignment to the adjacent colony, ranging from 0.001 to 0.496 (Table 5). The only exception was the DI_ALA colony, where the probability of assignment to other colonies was generally much lower (0.001–0.014) than the probability of assignment to the colony of origin (0.22) (Table 5).

Table 5 Average assignment probability of individuals to origin or adjacent colony

A significant positive correlation was found between genetic (F ST) and geographic distance matrices (Mantel test, r = 0.43, P = 0.04, Fig. 3); however, the correlation was not significant when DI_ALA was excluded from the analysis (r = 0.44, P = 0.15).

Fig. 3 Pairwise F ST values plotted against geographic distance between the studied colonies of the little auk. White circles denote pairwise relationships for the most distant colony at Diomede Island (DI_ALA)

Discussion

High genetic diversity was found in the little auk. Almost all individuals had a unique haplotype, and the level of heterozygosity in microsatellite markers was consistently high in all colonies. However, the overall level of genetic differentiation of the populations was very low, with an average fixation index (F ST) of 0.005 and a low probability of assigning individuals to their colony of origin. Consequently, a single genetic cluster was inferred for all colonies. These features place the little auk in the middle of a continuum of genetic variation in the Arctic avifauna. At one end of this continuum, there are species such as the dunlin Calidris alpina, which has been found to harbor considerable genetic diversity, with a very pronounced phylogeographic structure (Wennerberg and Bensch 2001). At the other extreme, there is the red knot C. canutus, for instance, which exhibits very low genetic diversity and shows no genetic structure, despite its wide breeding distribution (Buehler and Baker 2005).

Even though some genetic structuring could be expected given the disjunctive distribution of the breeding colonies and high nest-site fidelity in the little auk, the weak differentiation found in this study is similar to results from other seabird species with similar breeding range and biology. In particular, little or no genetic differentiation between populations has been reported for other alcids inhabiting the Arctic, e.g., Atlantic puffin (Fratercula arctica, Moen 1991), Atlantic subspecies of Brünnich’s guillemot (Birt-Friesen et al. 1992; Morris-Pocock et al. 2008), common guillemot (Moum et al. 1991; Friesen et al. 1996; Moum and Arnason 2001; Riffaut et al. 2005), and marbled murrelet (Brachyramphus marmoratus, Congdon et al. 2000; Friesen et al. 2005).

Various demographic and historical factors may contribute to the lack of genetic structure in the little auk. First, following the periods of changes in range and size, populations require time to establish equilibrium between mutation, migration, and genetic drift (Whitlock and McCauley 1999). The global little auk population is believed to have expanded significantly only recently, after extermination of the main food competitor, the bowhead whale (Balaena mysticetus), in the eighteenth century (Węsławski et al. 2000). Given this, along with the relatively recent origin of Arctic habitats (reviewed in Hewitt 2000), the population of the little auk may be in a nonequilibrium state. This argument was put forward to explain the weak population structure in the common guillemot (Friesen et al. 1996) and the razorbill (Moum and Arnason 2001).

Second, the weak level of differentiation of the little auk population may be the consequence of the evolutionary history of the species, including the pattern of the northward expansion of the population after the deglaciation. Long-term isolation in different refugia during the Pleistocene ice ages is usually invoked to explain genetic divergence of populations of Arctic birds (e.g., dunlin, Wennerberg and Bensch 2001). In contrast, species that do not exhibit genetic structuring, such as Atlantic Brünnich’s guillemot and razorbill, are believed to have expanded north from a single, southern refugium (Friesen et al. 1996; Moum and Arnason 2001). This latter scenario seems to apply equally well to the little auk. Also, given the current geographic distribution of little auk colonies, concentrated mainly in a relatively narrow sector of the North Atlantic (Stempniewicz 2001), the populations appear to have arisen through expansion of a single homogeneous refugial population.

Finally, high current intercolony dispersal would be enough to prevent genetic structuring of the population, as dispersal tends to oppose the effect of genetic drift and homogenize populations (Slatkin 1989). Nest-site fidelity is presumed to be high in the little auk (K. Wojczulanis-Jakubas, unpublished data), but natal dispersal (movement between the natal and recruitment site) may efficiently prevent genetic structuring. There is no information about philopatry and the extent of natal dispersal in the little auk, but it has been shown that in other alcids some young may breed away from their natal colony (e.g., Halley and Harris 1993; Harris et al. 1996; Olsson et al. 1999; Harris and Swann 2002). In particular, there is high potential for mixing between birds from different breeding sites at the wintering grounds. Some winter recoveries of little auks ringed in the breeding colonies in western Spitsbergen and northwestern Greenland indicate that waters off southwestern Greenland are an important wintering area for both of these populations (Isaksen and Bakken 1996). Also, a recent study on birds equipped with geolocator tags showed substantial overlap of the wintering areas among birds from breeding grounds on east and west Greenland and the Svalbard archipelago including Bear Island (Fort et al. 2013).

Despite the apparently low genetic differentiation of the global population of the little auk, some pairwise F ST comparisons of the breeding colonies showed significant differences (Table 4). The significant correlation of the F ST values with geographic distance suggests a pattern of isolation by distance (Avise 2000). Accordingly, the most distant colony at Diomede Island (DI_ALA) presented the highest values of pairwise F ST and F′ ST (although not always significant, possibly due to the low number of individuals sampled at Diomede Island). The distance between the Diomede colony and the nearest neighboring colony on NW Greenland is about 3,000 km, and it makes intuitive sense that the most distant population is also most differentiated from the others.

The low quantity and inferior quality of the samples from Diomede Island may cast doubt on the validity of these results. Most of the samples from this site were collected from museum specimens (possible effect on DNA quality), and the sampling period spanned about 40 years (possible effect of time on the population differentiation, reviewed in Balloux and Lugon-Moulin 2002). However, there are several lines of evidence suggesting that samples from Diomede Island yielded valid results. First, the samples from this site were successfully sequenced for the control region of mtDNA, 537 bp long, indicating sufficient quality of DNA. Also, the size of microsatellite alleles from Diomede Island fitted well within the overall range of all loci, and the number of unique alleles for that colony was rather low compared with the others, giving no indication of false alleles. Moreover, given the long life span of the little auk (at least 15 years, K. Wojczulanis-Jakubas unpublished data), and associated long generation time, the long period of sampling should not affect the frequency of alleles in the population. Nevertheless, further examination of the little auk population from Diomede Island would be recommended, especially given the interesting genetic differentiation found in the present study and the small size of the population (Day et al. 1988; Montevecchi and Stenhouse 2002).

Departure from HWE was found in all colonies but was apparent at more than one locus in only four of them (Isfjorden, Kongsfjorden and one colony from Magdalenefjorden on Svalbard, and Kap Hoegh on Greenland). Of these, linkage disequilibrium between two different pairs of loci was found for the Isfjorden and Kongsfjorden colonies. Although deviations from HWE or distortion from linkage equilibrium might be a random effect associated with the sample size and/or level of marker polymorphism, these two features combined may indicate some population distortion. In general, high levels of linkage disequilibrium and deviation from HWE in populations are associated with small effective population size, significant intrapopulation genetic structure, and occurrence of inbreeding (Li and Merilä 2010). Given that the Isfjorden and Kongsfjorden colonies are among the smallest in the Svalbard area, it is possible that the observed deviations reflect ongoing demographic processes in these two populations.

There was no indication of a segregation of mtDNA haplotypes according to the two described subspecies of little auks, despite their apparent morphological differences (Stempniewicz et al. 1996; Wojczulanis-Jakubas et al. 2011). This may be due to recent divergence of the two subspecies and/or high degree of current gene flow between them. Whether the morphological differences have a genetic basis and are sufficient to merit subspecific status cannot be assessed with our data. Clearly, there is a need for a more comprehensive molecular analysis to further explore the validity of little auk subspecies.