Introduction

Time is an essential dimension of the ecological niche of a species1,2,3. In consequence, organisms have evolved internal (i.e., endogenous) time-keeping mechanisms to anticipate changes in the environment that recur rhythmically with high precision. One such endogenous mechanism is the circadian clock, which enables organisms to anticipate diel (i.e., within a 24 h period) changes4,5. Circadian clocks continue to run under experimental conditions of constant light or darkness, and in natural environments they are entrained by diel changes in light intensity and duration. The circadian core clock in many organisms is well characterised on the molecular level as a set of genes that regulate biochemical oscillations with a period of 24 h (here called "circadian genes"5,6,7). For many species, for example fish, mammals and birds, we observe consistent variation in rhythmic behaviour at the level of the individual, e.g., individuals being consistently early or late relative to the population8. Such so-called "chronotypes" have been linked to allelic variation in circadian genes, mostly, but not exclusively, in humans9. The demonstrated links between behaviour and circadian gene regulators have made variation in rhythms particularly accessible to ecologists and evolutionary biologists10. Evolutionary interest so far has mostly focused on the gene Clk, a single copy gene that is constrained from evolving fast11. The translated protein CLOCK is a key element in the positive arm of the circadian transcription-translation feedback loop12. Clk is highly conserved among vertebrates throughout most of its sequence, except for a C-terminal exonic region that contains an often-variable poly-glutamine (poly-Q; CAG/CAA) trio-repeat motif. The length of this region varies between species, and in many species it also varies substantially between and within populations13.
This variable poly-Q region has been shown to influence the transcription activating potential of the CLOCK protein in Drosophila14 and mouse15, where truncated versions of the Clk

Circadian clocks, and the molecular mechanisms that control them, are also involved in annual time-keeping, in conjunction with circannual clocks. Their main role is measurement of photoperiod (daylength)16,17, whose predictable change over the year provides an external calendar of increasing amplitude pole-wards, while being absent at equatorial regions. Annual changes in photoperiod can time key life cycle events, such as reproduction or migration, either directly or by entrainment of the circannual rhythm, for which photoperiod is the main synchronising cue18. Thus, molecular changes in clock genes may affect annual timing either directly via pleiotropic effects on circadian and circannual rhythms, or indirectly via influencing photoperiodic responses19,20. For example, in many organisms the circadian clock defines a photo-sensitive time window during which reproduction can be triggered or terminated16,17. Hence, individuals with genetic disposition for early versus late-entraining circadian clocks are expected to respond differently to a given daylength21. Recent evidence supports corresponding diel and annual chronotypes, e.g. for migration-linked behaviour in a songbird22. However, photoperiod-dependent links between diel and annual time-keeping can be complex, since daylength increases in spring, but decreases in autumn.

So far, some studies have indeed demonstrated a correlation between Clk genotype and annual-cycle traits. For example, poly-Q repeat length correlated with timing of migration in fish23,24 and birds25,26, timing of moult27, as well as latitudinal clines and breeding phenology in some species28,29,30. However, these results varied between study systems and a consistent pattern is lacking, e.g. no such correlation was detected between poly-Q repeat length and latitudinal clines in other bird and fish species31,32,33. Furthermore, no correlation was found for timing of breeding32 and migration34 in other bird species, and some species were altogether lacking detectable genetic variation at the poly-Q locus35.

This lack of consistency has raised critical responses towards candidate gene approaches2,10,31,36, especially when evaluating genotype–phenotype association of complex behavioural traits that are generally assumed to be polygenically controlled. Furthermore, circadian genes have broad, highly pleiotropic effects, similar to “house-keeping” genes10, and there is scepticism whether selection for a particular trait (e.g., timing of breeding) would modify genes with such broad functions, rather than acting on more specific physiological pathways20,37,38. In addition, caveats have been raised against investigating single candidate genes that are part of wider pathways, like the Clk gene36,39, and genotype–phenotype associations based on polymorphism of these genes. The reasons for the observed inconsistency between studies are so far not clear.

Inconsistent findings on the role of the Clk gene in different species emphasize that results from these candidate gene studies cannot easily be generalized across taxa25. For different organisms, the role of Clk likely differs, for example due to differences in circadian organization, in genetic architecture, in photoperiodic environments, or in demands on time-keeping2,8,40,41. Thus, in some organisms diel and annual timing might correlate, whereas in others separate and independent regulation may be beneficial21,22,42. Furthermore, most studies are correlational, so that genetic contributions to timing are entangled with a host of responses to environmental conditions43.

Here, we address uncertainty over Clk genotype–phenotype associations using a widely discussed subject, avian annual-cycle timing, by a combination of experimental approaches and comparison of closely related taxa. In birds, variation in allelic diversity and repeat length has been focally studied in the context of environmental and behavioural seasonality, specifically breeding latitude, annual-cycle timing and migratory distance. With increasing latitude and thus intra-annual amplitude in day length, reproductive seasons are delayed and shortened, moult timing is shifted, and migratory distance generally increases44. Convincing support for a relationship between Clk genotype and annual-cycle timing, however, is scarce45,46.

Aiming to shed light on so far inconsistent results, we developed an experimental design that paired biogeographic associations with controlled experiments under common-garden conditions in captivity38. We compared closely related taxa within a species complex across wide latitudinal breeding ranges and used a common-garden experiment to disentangle inherited timing from environmental responses43.

Specifically, we investigated the relationship between Clk poly-Q polymorphism and breeding latitude, migration and annual-cycle timing in a well-studied songbird complex with a trans-equatorial distribution range, the stonechat (Saxicola spp.; Fig. 1). Taxonomy of the former species Saxicola torquata is undergoing rapid reassessment. Currently, Saxicola is considered a species complex, of which we are studying populations or subspecies assigned to the species Saxicola rubicola (Austria, Ireland, Spain, Germany), Saxicola torquatus (Kenya, Tanzania), Saxicola dacotiae (Canary Islands), Saxicola stejnegeri (Japan), and Saxicola maurus (Kazakhstan). Henceforth, we use the term “population” to refer to any group of individuals belonging to these taxa, except where otherwise stated. We compared Clk genotypes of nine populations with breeding ranges spanning 55° of latitude, including resident, short- and long-distance migratory populations from Europe, Asia and equatorial Africa (Table 1;47,48). In addition to their latitudinal variation, these populations differ in their annual timing of breeding, moult and migration. Furthermore, stonechats breeding at the equator experience a constant 12:12 h light/dark cycle throughout the year (Fig. 1), and thus present a natural scenario that allows us to contrast populations coping with annually fixed and seasonally changing daylengths in the same species complex, minimising cross-taxa comparative noise. However, previous experiments have shown that African populations still adjust annual timing to photoperiodic information in similar ways to northern congeners, a finding that was interpreted to reflect a possibly retained, ancestral pattern49.

Figure 1

Overview of the geographical distribution and migratory behaviour within the Saxicola complex. Breeding range (yellow) and wintering range (orange) is shown for migrants, year-round range (blue) is shown for residents; the equator is indicated by a dashed blue line. Breeding location is indicated by filled circles for migratory populations and open circles for residents; arrows depict migratory direction and distance (dotted line indicates partial migrants). The circular inlay schematic illustrates key life history events during the first annual cycle of a stonechat’s life, starting with hatching, followed by moult, autumn migration (in case of migrants), wintering period, return migration in spring, and breeding, before the cycle starts all over again. Focal timing events investigated in this study are highlighted in blue. Bird illustrations show the European S. rubicola, the Fuerteventura S. dacotiae, the African S. torquatus, the Siberian S. maurus and the Japanese S. m. stejnegeri86 taxa.

Table 1 Breeding latitude, migration distance [km], Clk poly-Q repeat length frequencies (Q8–15), Hardy–Weinberg exact test p-values, observed heterozygosity (HO) and gene diversity (GD) of nine stonechat (Saxicola spp.) populations listed by increasing latitude.

A subset of four of these populations was studied in a common-garden setting, specifically residents (Kenya), partial migrants (Ireland), short-distance migrants (Austria) and long-distance migrants (Kazakhstan). This approach required keeping all main known modifiers of timing consistent between populations. Thus, photoperiod, the main synchroniser of annual cycles, was set to simulate European daylength, under which all populations perform appropriate, population-specific annual activities47,49,50. Our annual-cycle perspective ranged from late summer (postjuvenile moult), through autumn migration to spring migration. We investigated Clk-related differences in timing with the following key objectives:

  1. Investigating geographic patterns of allelic diversity. Assuming that Clk genotype affects annual-cycle timing, we expect allelic diversity to vary with latitude and migration behaviour (here quantified as mean migration distance). We tested this hypothesis using all nine populations. If individuals within the population differ in timing or in their photoperiodic exposure, we would expect to find high allelic diversity in Clk. Evolutionarily, such differences could result from fluctuating selection on timing. For increasing breeding latitude and migration distance, we might expect increased allelic diversity due to fluctuating mortality linked to arrival timing and wintering latitude51,52. In equatorial populations, constant photoperiodic conditions, but inter-annually variable breeding opportunities could either lead to canalisation and favour fixation of an optimised circadian Clk genotype, or, conversely, diversified time-keeping and hence, a broad range of genetically determined chronotypes.

  2. Investigating associations between repeat length and population-level timing. If Clk allele lengths affect annual timing, we expect to detect differences in allele lengths with breeding latitude and migration distance in the nine populations of wild birds. Equatorial populations may again deviate from patterns at higher latitudes because of the absence of photoperiodic change.

  3. Investigating associations between Clk genotype and individual-level timing of annual cycle traits. We test directly for genotype–phenotype associations using the four captive populations kept under identical photoperiodic conditions. Because Clk allele lengths might affect annual timing via photoperiodic time measurement, we examine whether relationships between repeat length and timing reverse between autumn (decreasing daylength) and spring (increasing daylength). Finally, because the role of Clk in annual timing might differ between populations, e.g., due to changes in photoperiodism, we test for population-specific relationships.

Results

Clk gene diversity, breeding latitude and migration distance (objective 1)

We characterised Clk poly-Q allelic length variation in nine breeding populations and identified eight length variants ranging from 8 to 15 poly-Q repeats (Clk poly-Q8–15; subscript indicates the number of poly-Q repeats) found at medium to high frequencies (Table 1). The most common allele (MCA) was Q13 for Kenya, Tanzania, Ireland, Japan and Kazakhstan, and Q14 for Spain, Austria, Germany and the Canary Islands (Table 1). In both African populations and in the Canary Islands population, the MCA accounted for the majority of observed alleles, resulting in reduced population-specific allelic diversity: Canary Islands (MCA 98.4%), Kenya (MCA 84.4%) and Tanzania (MCA 88%). In contrast, the contribution of the MCA to overall allelic diversity was considerably lower in stonechats from the European and Asian continent: Kazakhstan (65.6%), Austria (61.2%), Germany (56.4%), Japan (54.5%), Spain (51.1%), Ireland (45.5%) (Table 1). Poly-Q allele frequency within stonechat taxa did not deviate significantly from Hardy–Weinberg equilibrium (p > 0.5; Table 1), except for the Canary Islands population (p < 0.01; Table 1).
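The tallying of allele frequencies and of the MCA share described above can be illustrated with a minimal Python sketch. The genotypes below are hypothetical toy values, not data from this study, and the function names `allele_frequencies` and `most_common_allele` are introduced here purely for illustration.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Return {repeat_length: frequency} from diploid genotypes.

    Each genotype is a pair of poly-Q repeat counts, e.g. (13, 14).
    """
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def most_common_allele(genotypes):
    """Return (allele, frequency) of the most common allele (MCA)."""
    freqs = allele_frequencies(genotypes)
    allele = max(freqs, key=freqs.get)
    return allele, freqs[allele]


# Hypothetical toy genotypes (NOT data from this study).
toy_genotypes = [(13, 13), (13, 14), (13, 13), (12, 13), (13, 15)]
mca, share = most_common_allele(toy_genotypes)
print(f"MCA = Q{mca}, share = {share:.1%}")
```

In this toy sample, seven of the ten allele copies carry 13 repeats, so the MCA is Q13 with a share of 70%.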

Clk gene diversity was characterised as observed heterozygosity (defined as frequency of observed number of heterozygotes) as well as within population gene diversity (defined as unbiased gene diversity per sample and locus by Goudet53). We analysed Clk gene diversity across the geographical range of stonechat populations in the context of breeding latitude and migration distance, two factors that were correlated, albeit not significantly (Pearson’s correlation: R = 0.56; p = 0.119; df = 7). Equatorial populations from Africa and the Canary Islands showed lower levels of diversity for both measures compared to all other populations; highest diversity levels were found in the Irish population (Table 1). Gene diversity was predicted significantly by breeding latitude, but not migration distance in a linear model (linear regression: latitude: F2,6 = 5.417; R2 = 0.64; p = 0.017; migration distance: p = 0.694). When we excluded the Canary Island population and restricted our analysis to continental populations, lowest levels of gene diversity were found in the African populations breeding close to the equator (Fig. 2). However, this model did not account for potential differences in neutral genome-wide nucleotide diversity.
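As a rough illustration of the two diversity measures defined above, the following Python sketch computes observed heterozygosity and an unbiased single-sample gene diversity. It assumes the estimator of Nei (1987), Hs = n/(n − 1) × (1 − Σp² − HO/(2n)), with n the number of genotyped individuals; whether this matches the exact computation used for Table 1 is an assumption. The genotypes are hypothetical toy values and the function names are illustrative only.

```python
from collections import Counter


def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles (HO)."""
    heterozygotes = sum(1 for a, b in genotypes if a != b)
    return heterozygotes / len(genotypes)


def unbiased_gene_diversity(genotypes):
    """Single-sample unbiased gene diversity (assumed Nei 1987 estimator).

    Hs = n / (n - 1) * (1 - sum(p_i^2) - HO / (2n)), n = number of individuals.
    """
    n = len(genotypes)
    if n < 2:
        raise ValueError("need at least two individuals")
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total_alleles = 2 * n
    sum_p_squared = sum((c / total_alleles) ** 2 for c in counts.values())
    ho = observed_heterozygosity(genotypes)
    return n / (n - 1) * (1 - sum_p_squared - ho / (2 * n))


# Hypothetical toy genotypes (NOT data from this study).
toy_genotypes = [(13, 13), (13, 14), (13, 13), (12, 13), (13, 15)]
ho = observed_heterozygosity(toy_genotypes)
gd = unbiased_gene_diversity(toy_genotypes)
print(f"HO = {ho:.3f}, GD = {gd:.3f}")
```

A sample in which every individual is homozygous for the same allele yields zero for both measures, which corresponds to the near-fixation pattern reported for the Canary Islands population.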

Figure 2

Clk locus specific and genome wide diversity of stonechat populations relative to breeding latitude. Gene diversity at the polymorphic Clk locus for the nine populations of this study is indicated by open circles (left axis); genome wide nucleotide diversity (autosomal pi) is plotted by black squares (right axis) for five populations with available genome wide statistics54. Clk gene diversity is lower in the African populations breeding at the equator than in the populations that breed at higher latitudes in Europe and Asia.

We thus additionally analysed the five populations for which genome-wide nucleotide diversity (defined as mean autosomal pi) was available (Kenya, Canary Islands, Austria, Ireland and Kazakhstan; estimates from Van Doren et al.54). This analysis tentatively evaluated whether the pattern we observed in Clk gene diversity is due to selection in that particular region of the genome or the result of whole genome elevation of nucleotide diversity. When adding autosomal pi as a covariate to our models for this subset of populations, we found no significant relationship of Clk gene diversity with breeding latitude, migration distance or genome-wide nucleotide diversity (linear regression: F3,1 = 1.717; R2 = 0.84; latitude: p = 0.305; distance: p = 0.522; autosomal pi: p = 0.589).

Clk repeat length, breeding latitude and migration distance (objective 2)

We detected no significant relationships of breeding latitude or migration distance of our nine stonechat populations with Clk poly-Q repeat length of a total of 716 individuals (mixed effects linear model with origin as a random factor; latitude: slope estimate ± SE: 0.12 ± 0.18; df: 6.63; p = 0.515; migratory distance: slope estimate ± SE: −0.083 ± 0.16; df: 7.61; p = 0.625; see Supplementary Material, Fig. S1).

Genotype–phenotype association: annual-cycle timing (objective 3)

To test for associations between Clk genotype, characterised as the mean number of poly-Q repeats at the variable locus, and timing of different focal traits we used the four captive populations and included all available data from a common-garden experiment47 (for sample sizes per model, see Fig. 3). We ran linear mixed effects models including Clk repeat length, origin, sex, and hatch date, as well as selected two-way interactions, for onset, peak and end of moult and spring and autumn migratory restlessness.
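To make the genotype measure and the "days per additional poly-Q repeat" estimates concrete, the sketch below computes each individual's mean repeat length and a simple least-squares slope of timing on repeat length. This is a deliberately simplified stand-in, not the linear mixed effects models used here: it ignores origin, sex, hatch date, interactions and random effects. The bird records are hypothetical toy values and the function names are illustrative only.

```python
from statistics import mean


def mean_repeat_length(genotype):
    """Mean number of poly-Q repeats across an individual's two alleles."""
    return sum(genotype) / len(genotype)


def ols_slope(x, y):
    """Ordinary least-squares slope of y on x (change in y per unit of x)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("no variation in x; slope is undefined")
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


# Hypothetical toy records: (genotype, day of year of moult onset).
# These are NOT data from this study.
toy_birds = [((13, 13), 200), ((13, 14), 205), ((14, 14), 210), ((12, 13), 195)]

repeat_lengths = [mean_repeat_length(genotype) for genotype, _ in toy_birds]
onset_days = [day for _, day in toy_birds]
slope = ols_slope(repeat_lengths, onset_days)
print(f"Delay per additional poly-Q repeat: {slope:.1f} days")
```

A positive slope means that individuals with longer mean repeat length show later timing; fitting such a slope separately per population, or with a population interaction as in the full models, is what allows population-specific relationships to be compared.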

Figure 3

Timing of moult and migration in relation to population-specific Clk gene mean repeat length in stonechats. Onset, peak and end of postjuvenile moult (a–c) and of migratory restlessness exhibited during autumn migration (d–f) and spring migration (g–i) are shown in relation to mean repeat length for four populations: Kenyan stonechat populations plotted in red, Austrian in blue, Irish in green, and Kazakh in purple.

Timing of postjuvenile moult

During this first annual-cycle stage in young birds, timing correlated positively with Clk allele length for onset, peak and end in the Kenyan population (Fig. 3a–c, Table 2). These three time points were delayed by 9, 11 and 12 days per additional poly-Q repeat, respectively. In contrast, the relationship between Clk mean allele length and timing of the Austrian, Irish and Kazakh populations differed significantly from those of Kenyans and was slightly negative (Fig. 3, Table 2). Hatch date had significant effects on moult timing, and this association was population-specific (Table 2). In contrast, moult timing was not affected by sex.

Table 2 Relationship between Clk repeat length and annual cycle timing. Estimates from nine linear mixed effects models are shown for the analysis of postjuvenile moult, and autumn and spring migratory restlessness. Reference population for the estimates is Kenyan, reference sex is male.

Timing of autumn migratory restlessness

For the subsequent annual-cycle stage, autumn migration, we found a positive relationship between Clk allele length and timing (Fig. 3d–f, Table 2), which did not differ significantly between populations. Slopes were steepest in Kenyan stonechats, whose onset, peak and end were delayed by 33, 25 and 11 days per poly-Q repeat, respectively. Slopes in the remaining populations were far less steep, but overall positive, except for end of autumn restlessness in Irish stonechats (Table 2). Sex showed no association with autumn timing.

Timing of spring migratory restlessness

Population-specific patterns for spring migratory restlessness were similar (Fig. 3g–i, Table 2) to those observed for moult timing, and were clearest for the onset of restlessness. Kenyan stonechats delayed onset, peak and end per poly-Q repeat by 17, 12 and 13 days, respectively, whereas Austrian, Irish and Kazakh populations showed no association or slightly advanced timing with increasing poly-Q repeat. Additionally, we confirmed protandry (i.e. males started migratory restlessness slightly earlier than females) during spring migration (Table 2).

Discussion

We characterised Clk gene poly-Q variation in 950 records from 717 individuals from nine closely related populations of stonechats, including residents, short-distance and long-distance migrants. The latitudinal range covered by these populations also included equatorial populations, which to our knowledge, were not investigated in previous studies. Our study system thus newly allowed us within the same species-complex to contrast Clk gene variation across a latitudinal gradient including the equator.

All stonechat populations were in Hardy–Weinberg equilibrium, suggesting random mating within populations of the same subspecies (p > 0.5, Table 1), except for Canary Island stonechats, which diverged from the mainland stonechats 1.6 mya55 and are endemic to the island of Fuerteventura56. A bottleneck event in the Canary Island population after colonization may explain their low genetic diversity54, and their resident behaviour57, along with non-overlapping ranges with other stonechat taxa, may explain the deep genetic differentiation. This history is consistent with significant deviation from Hardy–Weinberg equilibrium (p = 0.008; Table 1) and advises caution when interpreting population genetics results from the Canary Island population.

Clk gene diversity in different stonechat populations across a latitudinal gradient revealed substantial variation in different Clk poly-Q allelic variants ranging from 8 to 15 repeats at the variable Q-locus (Table 1). Diversity differs considerably across study systems, and in Saxicola spp. both the level of diversity at the Clk locus (here as observed heterozygosity [HO = 0.016–0.909]) and the number of different allelic variants (eight) are high compared to other passerines. For example, earlier studies found only four different poly-Q variants in some species, including barn swallows (Hirundo rustica, Q5–8, HO = 0.066) and pied flycatchers (Ficedula hypoleuca, Q10–13, HO = 0.478)26, while five different poly-Q variants were reported in other species, including tree pipits (Anthus trivialis, Q6–10) and nightingales (Luscinia megarhynchos, Q9–Q13, HO = 0.55)26. With exclusion of rare outliers, six was the highest number of poly-Q variants previously reported in several species, including whinchats (Saxicola rubetra, Q11–16, HO = 0.125)26, blue tits (Cyanistes caeruleus, Q9–14, HO = 0–0.637)28, bluethroats (Luscinia svecica, Q10–15, HO = 0–0.476)28 and great tits (Parus major, Q10–15, HO = 0.077)35. One possible explanation for the high variation observed in the stonechats could be that our study comprised populations that varied not only in breeding latitude, but also in their migratory behaviour.

Across stonechats, we found a latitudinal pattern in Clk gene diversity, whereby genetic variation was reduced in populations breeding at the equator. This is particularly interesting in a chronobiological context as the scope for light entrainment at the equator dramatically differs from seasonally varying light-dark cycles at higher latitudes. The two equatorial breeding populations from Kenya and Tanzania, which experience constant 12-h days throughout the year, differed from all other populations included here by a significant reduction in Clk gene diversity. Previous research on African stonechats has revealed that life-cycle timing shows robust annual cycles, both in free-living birds and under restrictive experimental conditions. Captive African stonechats held under constant 12-h light–dark cycles retained clear circannual rhythms of breeding condition and moult for several years58,59. Thus, although African stonechats have retained the ability to respond to changing photoperiod49, they appear to rely on an endogenous timing mechanism. Hence, the lower variability in Clk repeat length could be a signature of selection facilitating adaptation to a constant photoperiod in the birds’ environment, resulting in a putatively optimised and less variable genotype58,59.

Circannual rhythms of European stonechat populations that live permanently in the northern hemisphere60 are less rigid, and the substantial changes in photoperiod they experience may play a dominant role for their annual time-keeping50,61. Hence, greater variability in Clk repeat length might have resulted from fluctuating inter-annual selection for photoperiodic timing, for example, due to rough winters or sudden weather changes51,52. Thus, we speculate that breeding populations in Germany, Austria, Spain and Ireland require integration of photoperiodism20, and thereby a higher degree of genetic diversity in Clk genotype.

We consider it unlikely that the observed genotype–phenotype patterns for the focal Clk locus are instead caused by random drift. While similar Clk gene diversities in Austrian, German and Irish stonechat populations could be due to ongoing gene flow between these populations (e.g. resulting from geographic proximity and breeding dispersal54,62), phylogeny cannot explain the similarly high Clk gene diversity we find in Kazakh stonechats, which diverged over 2.5 mya55. Furthermore, the pattern in genetic diversity we observed for the focal Clk locus differs from a genome-wide characterisation across populations, which showed similar levels across all populations except for Kazakhstan, where levels were elevated (see Fig. 2; 54). However, when we included genome-wide diversity of five of the nine populations as a covariate, we no longer observed a significant relationship between Clk gene diversity and latitude. We thus cannot exclude that the higher Clk gene diversity at higher latitudes could also be a result of higher genome-wide diversity. However, power for these results from only five populations was low, and results might be driven by the high genome-wide diversity in Kazakhstan54. The other populations showed similarly low diversity values, although the Austrian and Irish populations breed at similar latitude as the Kazakhstan population. The Kazakhstan stonechat has a vast breeding range and high effective population size63, which may have caused its outlying, elevated genome-wide diversity54. Conversely, despite its low genome diversity, the Irish population shows high Clk gene diversity. It is tempting to speculate that the particular diversity in the Clk gene is associated with the partial migratory nature of the Irish population62. Among partial migrants, the resident fraction of a population experiences different photoperiods, and typically breeds earlier in the season, compared to the migrant fraction47.

Our results on stonechats provide little support for effects of migration on Clk gene diversity, independently of breeding latitude. The only indicative effects of migratory phenotype are the differences in level of heterozygosity between migrants and residents (see HO in Table 1), where all migrant populations show at least twofold higher levels of heterozygosity compared to most residents (Table 1, Fig. 2). However, reduced levels of Clk gene diversity in the population from the Canary Islands are likely explained by demographic history, and although the Spanish population consists of residents, its Clk gene diversity is as high as in populations of migrants. In addition, Clk gene diversity was significantly predicted only by breeding latitude, but not by migration distance.

In a cross-species comparative approach across different trans-Sahara migrants Bazzi et al.46 hypothesise that selection mechanisms for longer repeats in species with small northern breeding ranges could restrict their postglacially acquired Clk diversity, while selection forces in species with larger breeding ranges should be weaker resulting in higher Clk diversity. They suggest that analyses of Clk gene variation in nonmigratory African relatives of Afro-Palearctic migrants, such as included in our study, should provide additional insights for the hypothesised evolutionary scenario. Results in stonechats, however, are contrary to what Bazzi et al.46 hypothesized: genetic diversity in African stonechat is significantly lower compared to migratory stonechat populations, and thus does not provide support for this hypothesis within the stonechat complex.

A latitudinal cline in Clk gene variation with longer repeats at higher latitudes has been demonstrated in various bird systems28, suggesting a functional link between changes in daylength and Clk gene variation. However, our comparison of mean Clk length between populations showed no significant relationship with latitude, in contrast to earlier studies from different species of birds and fish23,28,46. In our comparison, specifically long-distance migratory stonechats from the Kazakh population that breed at similar latitudes as European short-distance migrants, showed high frequencies of the longest Clk alleles we observed (15% of alleles are Q15, Table 1). This observation points towards contributions of additional factors, such as environmental or climatic variation at the breeding or wintering area, or other characteristics of migratory routes, in correlative studies.

To separate genotype–phenotype associations between Clk allele length and annual timing from environmental influences, we compared captive individuals from four populations in a common garden, mimicking European daylength changes. These analyses revealed clear population-specific patterns that depended to a lesser degree on time of year. Clk allele length in the equatorial stonechats from Kenya correlated positively with timing. Individuals delayed onset, peak and end of the three investigated life-cycle stages, moult and migratory restlessness in autumn and spring, by 9–33 days per additional poly-Q repeat. For migratory restlessness, timing of onset showed the closest relationship to Clk allele length. This fits with earlier findings from stonechats and other species showing that especially onset of this behaviour is under particularly strong, presumably genetic, regulation47,64.

Overall, genotype–phenotype associations were retained across a broad range of photoperiods and in traits as different as seasonal nocturnality and plumage renewal. Although sample sizes varied between populations (range: 4–23 individuals), the consistent results suggest that Clk allele length is robustly associated with individual chronotypes in the equatorial population. In contrast, the three populations breeding at higher latitudes showed weaker and seasonally variable genotype–phenotype associations. Similar to Kenyan stonechats, high-latitude individuals delayed autumn migratory restlessness with increasing poly-Q repeat numbers, but patterns were inverse for moult, and varied for spring migratory restlessness.

Overall, our results fit well with evidence that circadian clock gene variants can also be relevant for coordination of annual timing, but their roles and effects depend on species, populations and time of year65. More broadly, genetic contributions to annual cycle timing have been shown in several common garden studies, in quantitative genetic analyses and in breeding experiments43,61. Heritability (h2) estimates from captive migratory songbirds were medium to high for onset of migratory restlessness in blackcaps Sylvia atricapilla (0.34–0.45) and garden warblers Sylvia borin (0.67 onset for spring and autumn migration); heritability estimates for termination of migratory activity however were lower (0.16–0.44 in blackcaps66,67; for a summary also see68). Heritability estimates of migration timing traits from the wild are scarce, and estimates of repeatability and heritability are generally moderate, but sometimes lower, for example in Collared flycatchers (Ficedula albicollis)65. It is thus clear that to a varying degree, flexibility and non-genetic factors, such as learning, state and ontogenetic factors, also contribute to individual variation69. Inherited timing programs need to integrate information from multiple physiological pathways, such as metabolism and photic input70. Hence, evolutionary change in annual cycle timing can involve several pathways including, but not limited to, genes with known circadian roles such as Clk71.

Previous findings from the captive stonechat populations used in this study indicate high individual consistency and high heritability estimates of annual timing traits in the Saxicola complex, which also differed between populations47,50,72. Our new data suggest the possibility that the underlying genetic basis for individual timing might differ between populations from different latitudes. In equatorial populations, variation in annual chronotype might be partly regulated through a limited range of variants in the gene Clk that exert major effects, whereas at higher latitudes, Clk variants may mainly act in conjunction with photoperiodic pathways, and hence show weaker genotype–phenotype associations. Despite the lack of a mechanistic framework and reservations regarding the candidate-gene approach, our results provide important and novel insight into the possible genetic basis of annual-cycle timing.

In conclusion, we found a latitudinal cline in Clk gene diversity in a large dataset of 950 records from 717 individuals distributed over a wide geographical range, confirming findings in other species. Making use of a common-garden setting, our study highlights that the relationship between Clk polymorphism and annual-cycle timing in captive stonechats depended on population and on time of year. Our findings also allow us to speculate that in populations that live under the unchanging photoperiods at the equator, Clk genotype may be less variable, but show a strong association with annual chronotype. Conversely, at higher latitudes, other evolutionary forces may favour higher Clk gene diversity, but its association with phenotype may be obscured by additional molecular inputs into annual timing that furthermore depend on time of year.

Material and methods

Study populations

The breeding distribution of stonechats covers a wide geographical range (35° S–75° N; 20° W–180° E). Different populations exhibit a variety of migratory behaviours (Fig. 1) and differ geographically in the timing of annual processes such as breeding, moult and migratory activity47,50. In captivity, differences in timing between free-living populations largely persist when the birds are kept under common-garden captive conditions50, and even resident populations from equatorial Kenya display migratory restlessness, albeit at a low level47,50. Combined, this makes stonechats an ideal system to study associations of Clk gene polymorphisms with different processes. Here, we capitalise on the following phenotypically and geographically distinct populations to study Clk gene polymorphism within one species complex (Table 1)54,55,73:

We studied two African mainland populations [Saxicola torquata axillaris Shelley 1884: 135 individuals originating from Kenya (0°14′ S, 36°0′ E) and 52 individuals originating from Tanzania (3°5′ S, 36°5′ E)], which are year-round residents. We further included a resident island population (n = 61) endemic to Fuerteventura, Canary Islands, Spain (28°46′ N, 14°31′ W; Saxicola dacotiae dacotiae Meade-Waldo 1889; referred to as ‘Canary’ in the main text56).

We further included four populations of European stonechat Saxicola rubicola ssp: (1) one population (n = 217) sampled in Austria (Neusiedel; 48°14′ N, 16°22′ E; Saxicola rubicola rubicola L. 1776), (2) one population (n = 47) from north-west Germany (Borken; 51°47′ N, 6°01′ E; Saxicola rubicola rubicola L. 1776;74), (3) one population (n = 184) from Ireland (Killarney; 52° N, 10′ W; Saxicola rubicola hibernans Hartert 1910), and (4) one population (n = 93) from mainland Spain (Seville; 37°39′ N, 5°34′ W; Saxicola rubicola rubicola L. 1776). Birds from Austria and Germany are obligatory medium-distance migrants that spend the winter in Mediterranean regions including North Africa. Irish stonechats are partial migrants that either stay on the breeding grounds year round or migrate over short distances62,75. Spanish birds originate from a population of non-migratory stonechats (Serrano, D. unpublished data).

Lastly, we included two populations of long-distance migrants, Eurasian stonechat Saxicola maurus ssp: one population (n = 150) from Kazakhstan (Kustanaj; 51.5° N, 63° E; Saxicola maurus maurus Pallas 1773) and a second population (n = 11) from Japan (Hokkaido; 43.6° N, 141.23° E; Saxicola maurus stejnegeri L. 1766). Kazakh birds migrate to India, southern continental China and North East Africa54,76, while Japanese birds migrate from Hokkaido over a continental path to winter in mainland southeast Asia77.

Sample origin

We successfully genotyped 717 individuals from the nine study populations for variable Clk poly-Q repeat length. Most samples (n = 504), including birds from populations in Austria (n = 130), Ireland (n = 112), Kazakhstan (n = 90) and East Africa (Kenya and Tanzania; n = 172), originated from a common-garden experiment that Eberhard Gwinner initiated in 1981 at the Max Planck Institute for Ornithology in Andechs, Germany, and continued until 200547,50,78. These samples were thus collected from birds born over a long time span, including some individuals that were born in captivity and others that were brought in from the wild. African and European stonechats hatched between 1982 and 2004, with similar spread over time, whereas Irish and Siberian stonechats hatched between 1997 and 2005. Additional genotype samples (n = 213) were collected from birds sampled in the wild at their breeding grounds. Specifically, these included populations from the Canary Islands56, Spain (D. Serrano, unpublished data), Germany79 and Japan77. These samples were collected between 1998 and 2017.

All methods were carried out in accordance with relevant guidelines and regulations. All experimental procedures were approved by the Ethics Committee of the state of Upper Bavaria under permit number 55.2-1-54-2531-119-05 and conformed to the relevant regulatory standards. The study was carried out in compliance with the ARRIVE guidelines (https://arriveguidelines.org).

Genetic analyses

Genomic DNA was isolated from blood (n = 706) and feathers (n = 11) using a salt extraction protocol and diluted to a working concentration of 25 ng/µl. Genomic DNA samples were genotyped for length polymorphism in the variable poly-Q repeat region of the Clk gene using polymerase chain reaction (PCR) amplification followed by length characterisation of the variable region. PCR amplification was carried out in a 10 µl total volume using a previously published primer set (Johnsen et al.28; forward primer, 5′-labelled with the ‘blue’ fluorescent dye 6-FAM: 5′-6-FAM-TGGAGCGGTAATGGTACCAAGTA-3′; reverse primer: 5′-TCAGCTGTGACTGAGCTGGCT-3′). PCR conditions were optimised for stonechats following conditions published in Liedvogel et al.29. Amplification conditions for Taq DNA polymerase-catalysed PCR were: Mg2+ = 2 mM; 95 °C/2 min; 95 °C/30 s, 56.8 °C/30 s, 72 °C/30 s, 40 cycles; 72 °C/5 min, 4 °C hold. PCR products were prepared for capillary electrophoresis by adding a Hi-Di formamide/LIZ 500 mixture. Length polymorphism of amplified products was analysed using Geneious version 10.1.3. Repeat counts of length-characterised samples were confirmed by Sanger sequencing of 50 samples across populations using an adapted version of a previously published protocol28,29 with the following primers: forward: 5′-TTTTCTCAAGGTCAGCAGCTTGT-3′; reverse: 5′-CTGTAGGAACTGTTGCGGGTGCTG-3′. Amplification conditions using Taq DNA polymerase for Sanger sequencing were optimised for the stonechat material: Mg2+ = 2 mM; 95 °C/2 min; 95 °C/30 s, 61.1 °C/30 s, 74 °C/30 s, 40 cycles; 74 °C/5 min, 4 °C hold. Nucleotide sequences of Exo/SAP-purified PCR fragments were determined with BigDye Terminator ready reaction mix, version 3.1 (Applied Biosystems) under standard sequencing conditions according to the manufacturer’s protocol29.

Genotype characterisation and population genetic analyses

We characterised Clk genotype as mean allele length ((p + q)/2; as previously used in comparative studies26,28) and ran additional sensitivity tests using minimum and maximum allele length in our models. Results from these sensitivity tests were qualitatively the same; therefore, we only present results from models with mean allele length in the main text. Genotype frequency data for all populations were tested for deviation from Hardy–Weinberg equilibrium using GENEPOP 4.2 (web version; http://genepop.curtin.edu.au/; accessed 1 November 2020). We used the same Markov chain parameters as in previous studies28,29: dememorization (10,000), batches (10,000) and iterations per batch (10,000). For each population, we calculated observed heterozygosity as a measure of Clk allelic diversity, defined as the number of observed heterozygotes divided by the total number of individuals. To account for differences in sample size, we used the program FSTAT version 2.9.3.253 to calculate Fsdiv, an index of unbiased gene diversity. Results of gene diversity remained similar when we excluded captive-bred individuals from our dataset (see Supplementary Material Table S1).
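
As a rough illustration of the genotype summaries described above, the following Python sketch computes mean allele length, observed heterozygosity and a sample-size-corrected gene diversity for a single population. The genotypes are invented for illustration and the function names are hypothetical. The gene-diversity formula is Nei's (1987) unbiased estimator, which is assumed here to correspond to the index reported by FSTAT; this is not the code used in the study.

```python
from collections import Counter


def mean_allele_length(p, q):
    """Mean poly-Q repeat count of the two alleles of one individual: (p + q) / 2."""
    return (p + q) / 2


def observed_heterozygosity(genotypes):
    """Proportion of individuals whose two alleles differ in length."""
    heterozygotes = sum(1 for p, q in genotypes if p != q)
    return heterozygotes / len(genotypes)


def unbiased_gene_diversity(genotypes):
    """Assumed estimator (Nei 1987): n / (n - 1) * (1 - sum(p_i^2) - Ho / (2n)),
    where n is the number of individuals and p_i are allele frequencies."""
    n = len(genotypes)
    if n < 2:
        raise ValueError("need at least two individuals")
    allele_counts = Counter(allele for genotype in genotypes for allele in genotype)
    total_alleles = 2 * n
    sum_sq = sum((count / total_alleles) ** 2 for count in allele_counts.values())
    ho = observed_heterozygosity(genotypes)
    return n / (n - 1) * (1 - sum_sq - ho / (2 * n))


# Hypothetical genotypes (poly-Q repeat counts per allele); not data from this study.
genotypes = [(12, 12), (12, 13), (11, 12), (12, 12)]

mean_lengths = [mean_allele_length(p, q) for p, q in genotypes]
ho = observed_heterozygosity(genotypes)
hs = unbiased_gene_diversity(genotypes)
```

The correction factor n / (n − 1) matters most for small samples, which is why a corrected index is useful when sample sizes differ between populations.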

To examine whether the Clk locus is under selection, we compared genetic differentiation at the Clk locus with presumably neutral, genome-wide nucleotide diversity. Levels of genome-wide nucleotide diversity were calculated as autosomal nucleotide diversity (pi) from publicly available whole-genome re-sequencing data for five of the included stonechat populations (Kenya, Canary Islands, Austria, Ireland and Kazakhstan54). To test whether differences in gene diversity were due to differences in breeding latitude or migratory phenotype, we ran linear models predicting gene diversity from the covariables latitude, migratory distance and pi.

At the individual level, we tested for an effect of latitude or migratory distance on Clk gene length by running linear mixed effects models with Clk mean repeat length as response variable, either latitude or migratory distance as predictor variable, and origin as a random effect.

Phenotypes of captive stonechats

Data on migratory phenotypes were previously presented for our four captive stonechat populations47. Briefly, stonechats were hand-raised in the Max Planck Institute for Ornithology, as offspring of captive or wild stonechats belonging to the Kenyan, Austrian, Irish or Kazakh populations (for detail see47). The birds were kept indoors under simulated natural daylength changes in south Germany (47.5° N during the breeding season, and 40° N during winter50). Here we include all birds with available phenotype and genotype data (migration: n = 142 for spring and n = 188 for autumn; moult: n = 215 for onset, n = 209 for peak and n = 206 for end).

Postjuvenile moult in stonechats is the first moult after hatching and equates to the first prebasic moult80. A precise description of how moult was recorded was published previously72. In brief, immature stonechats were regularly checked for body moult in 19 plumage areas. The number of plumage areas moulting provided a moult score for each date to determine the timing of onset and completion of moult72.
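
To make the scoring idea concrete, the sketch below derives a moult score from checks of the 19 plumage areas and reads off timing points from a series of dated scores. The definitions of onset, peak and end used here (first day with a score above zero, first day with the maximum score, last day with a score above zero) are simplifying assumptions for illustration only; the criteria actually applied are described in the cited protocol72. All names and data are hypothetical.

```python
def moult_score(moulting_areas):
    """Moult score for one check: number of plumage areas in active moult.

    moulting_areas is a sequence of booleans, one per plumage area checked.
    """
    return sum(1 for is_moulting in moulting_areas if is_moulting)


def moult_timing(checks):
    """Return (onset, peak, end) as days of year, or None if no moult was seen.

    checks: list of (day_of_year, score) pairs sorted by day.
    Assumed definitions: onset = first day with score > 0,
    peak = first day with the maximum score, end = last day with score > 0.
    """
    active = [(day, score) for day, score in checks if score > 0]
    if not active:
        return None
    onset = active[0][0]
    end = active[-1][0]
    # max() returns the first maximal element, i.e. the earliest peak day.
    peak = max(active, key=lambda day_score: day_score[1])[0]
    return onset, peak, end


# Hypothetical record for one bird (day of year, moult score); not study data.
checks = [(180, 0), (187, 3), (194, 11), (201, 17), (208, 9), (215, 2), (222, 0)]
timing = moult_timing(checks)
```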

We used nocturnal migratory restlessness behaviour as proxy for migratory timing. Migratory restlessness, the nocturnal activity of captive birds during migration periods, generally mirrors the migratory timing of free-living conspecifics81, although in stonechats and other songbird species, resident populations also show some migratory restlessness47. A detailed description of how migratory restlessness of the stonechats was recorded and characterised is published elsewhere47,50.

Analysis of genotype–phenotype association

To investigate the correlation between Clk genotype and annual-cycle phenotype, we ran linear mixed effects models including mean Clk repeat length, origin, sex and a two-way interaction of Clk repeat length with origin as fixed effects, and year and individual as random effects, in our comparisons with migratory timing. For our analysis of postjuvenile moult, we added hatch date and a two-way interaction of origin and hatch date, as we expected a potential effect of hatch date on moult timing, as demonstrated in a previous study72. Random effects were not used in this analysis, since we only had values from 1 year.
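
The interaction of Clk repeat length with origin asks whether the slope of timing on repeat length differs between populations. A minimal Python sketch of that idea is given below, using ordinary least squares fitted separately per origin. It is a deliberate simplification: it ignores sex, hatch date and the random effects of year and individual, so it is not equivalent to the mixed models described above. The data and all function names are hypothetical.

```python
from collections import defaultdict


def ols_slope_intercept(x, y):
    """Simple least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("predictor has no variation; slope is undefined")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def slopes_by_origin(records):
    """Fit one slope per origin from (origin, clk_mean_length, day_of_year) records."""
    grouped = defaultdict(lambda: ([], []))
    for origin, clk, day in records:
        grouped[origin][0].append(clk)
        grouped[origin][1].append(day)
    return {origin: ols_slope_intercept(x, y)[0] for origin, (x, y) in grouped.items()}


# Hypothetical records: (origin, mean Clk repeat length, timing as day of year).
records = [
    ("A", 11.0, 250), ("A", 12.0, 254), ("A", 13.0, 258),
    ("B", 11.0, 262), ("B", 12.0, 261), ("B", 13.0, 260),
]
slopes = slopes_by_origin(records)
```

In this toy example, timing is delayed with increasing repeat length in origin A and advanced in origin B, which is the kind of pattern an interaction term would capture in a single model.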

We tested the genotype effect on timing of postjuvenile moult and of spring and autumn migration as response variables, using three different time-points, i.e. onset, peak and end (given as Julian date). Analyses were performed in the RStudio interface of R version 3.3.382,83 using the packages ‘lme4’84 and ‘sjPlot’85.

All trends remained unchanged in alternative models where one individual with a rare short poly-Q repeat length of Q8 was initially excluded. However, Q8 genotype for this individual (as classified by size fragment determination) was confirmed by Sanger sequencing, giving us no reason to exclude it from the analysis.