Geographic variation in birdsong can affect gene flow among populations and play an important role in avian speciation (Slabbekoorn and Smith 2002; Edwards et al. 2005; Price 2008). Song variation and associated acoustic response patterns by potential mates and competitors are well-accepted as descriptors of species boundaries and key factors in intra-specific population divergence. Therefore, understanding the causes and consequences of song divergence among populations of the same species has great potential to yield insight into the evolutionary emergence of restrictions on gene flow through song-dependent success in territory establishment and mate attraction (e.g. Irwin 2000; Danner et al. 2011; Kleindorfer et al. 2013; Greig et al. 2015).

Acoustic variation among different populations of the same bird species may arise through different processes (Ellers and Slabbekoorn 2003; Kroodsma 2004; Podos and Warren 2007). Dispersal to established or new populations may cause founder effects for vocal memes through drift and mutation processes (Lachlan et al. 2013). Many species are capable of vocal learning (Slater 1986; Hultsch and Todt 2004; Beecher and Brenowitz 2005), and song sharing among neighbours, together with the emergence of novel variation through copy errors, may yield patterns of local convergence and geographic divergence (e.g. Ellers and Slabbekoorn 2003). Furthermore, song evolution may be shaped by morphological and physical constraints (Podos and Nowicki 2004) as well as environmental conditions (Boncoraglio and Saino 2007; Ey and Fischer 2009).

Migratory behaviour is another important ecological factor that has been related to patterns of geographic variation in birdsong and speciation (Mortega et al. 2014; Rolland et al. 2014). Some studies have suggested that migratory populations have evolved more divergent songs through improvisation, and more elaborate songs because they are under stronger sexual selection due to a short breeding season (Catchpole 1982; Collins et al. 2009). Sedentary populations may also have more convergent songs through copying of neighbourhood songs, at least in some species, driven by the higher consistency in the identity of neighbouring males (Kroodsma 1999). Furthermore, variation in habitat density and species diversity may yield variable competition for acoustic space, and thereby allow song elaboration more easily at relatively high latitudes, where vegetation is typically less dense and acoustic competition with other species less intense (Weir and Wheatcroft 2011; Singh et al. 2015). Several comparative studies have confirmed positive correlations between song elaboration and latitude, both among (Catchpole 1982; Botero et al. 2009) and within species (Collins et al. 2009; Kaluthota et al. 2016). However, other studies have reported contrasting results (Byers 2011, 2015; Medina and Francis 2012), and more studies are needed to understand when and why song elaboration varies with latitude.

Song variation linked to migratory tendency may affect the probability of gene flow between populations at different latitudes (Collins et al. 2009; Mortega et al. 2014; Gordinho et al. 2015). Populations breeding at different latitudes may be spatially and temporally segregated in terms of vocal activity and perceptual windows for vocal learning. Consequently, a pattern of isolation by distance in geographic song variation is likely to emerge through both individual dispersal and song learning processes, which opens up potential for acoustically guided population divergence (Ellers and Slabbekoorn 2003; Podos and Warren 2007). This may be especially the case if such variation is linked to a heritable trait of ecological relevance such as migratory tendency (Winker 2000; Slabbekoorn and Smith 2002; Rolland et al. 2014). Species confined to fragmented habitats may be a special case in this respect, as they occur in separated populations in habitat islands (cf. Baker 1996; Slabbekoorn et al. 2003), which potentially induces discrete and concordant steps in variation for both acoustic traits and migratory tendency.

The marsh grassbird (Locustella pryeri Seebohm) provides a suitable case for investigating divergent and convergent influences on bird song variation. Its range is limited to wetland habitats and it is categorized as ‘Near Threatened’ (BirdLife International 2012). Seven island populations have been identified for the sinensis subspecies in China (Fig. 1a), which is a partial migrant: birds from the three northern populations are migratory, whereas birds from the four more southern populations are sedentary (Li and Wang 2006; Zhang 2011; Hou 2014). Marsh grassbirds are typically cryptic, except during the breeding season, from April to July (Madge 2006), when they exhibit a stereotypic song composed of a repeating trill sung while perched, followed by a variable warble that is sung during a distinct flight display (Qu et al. 2011).

Fig. 1

Four bioclimatic maps of the eastern and north-eastern part of China. a The map of China with the locations of all seven wetland localities of the marsh grassbird study populations. White dots of M1-3 indicate fully migratory populations in the north, and black dots of S4-7 indicate sedentary populations in the south. We provide information on 4 of the 19 bioclimatic variables (see Table 2) from the WorldClim database, which is based on average monthly climate data collected between 1950 and 1990 and interpolated globally at a resolution of 30 arc-seconds (Hijmans et al. 2005). b Annual mean temperature (bio 01) in °C*10 (the WorldClim database multiplies its °C values by 10 for computational reasons); c isothermality (bio 03), which is the mean diurnal range (bio 02) divided by the annual temperature range (bio 07); d annual precipitation (bio 12) in mm; and e precipitation seasonality (bio 15)

In this study, we provide a first description of geographic variation in the two structurally different parts of marsh grassbird song. We first quantified the structural differences between these two song parts, and then tested whether song variation was correlated with latitude and bioclimatic variables. We included rainfall and temperature, which are well known to affect migratory behaviour (Salewski and Bruderer 2007; Louchart 2008; Zink 2011) and possibly song structure (e.g. Botero et al. 2009; Weir and Wheatcroft 2011; Medina and Francis 2012). We explored whether the acoustic variation within songs and across latitude and climate was in line with expectations for divergent functions of the different song parts. We expected to confirm anecdotal field observations suggesting that the opening trill would consist of simpler and more repetitive elements than the variable warble. Furthermore, especially for the warble, we expected a pattern of song elaboration with latitude and with stronger seasonal fluctuations in climatological conditions in the north, which are correlated with longer migratory distances, shorter breeding seasons and stronger sexual selection pressures.

Materials and methods

Song recording and measurements

Songs were recorded by multiple observers during the breeding season (April–July) between 2008 and 2014, from 6 to 11 A.M. (see Fig. 1 and Table 1 for recording localities), using a Roland R-44 digital recorder (Roland China Limited, Beijing, China) and a Sennheiser MKH 416 directional microphone (Sennheiser Electronic Co. Ltd, Beijing, China) at a sample rate of 44.1 kHz. The males were not banded, but male grassbirds are strongly territorial and show high local site fidelity, which makes it likely that recordings from different territories are also from different males. In a previous study, we showed that each male grassbird has a unique song with a combination of many different element types that are sung in more or less repetitive series (Qu et al. 2011). Each song contains two distinct parts: a trill (referred to as part a in this study), consisting of a long series of elements sung in repetitive fashion, followed by a warble (referred to as part b), a long series of elements that are more complex and that are sung in highly variable sequences with only a few repeats and typically a different element type for every subsequent element in the string (Fig. 2). Observations in the field and video recordings indicate that the transition from the simple, repetitive trill to the more variable warble typically follows soon after the shift from perched song to a song display flight.

Table 1 Recording information of localities and individuals
Fig. 2

Spectrograms of a typical song of the marsh grassbird Locustella pryeri sinensis showing the two song parts, how we assigned elements to different types and how the parameters were measured in this study

We chose about five songs of high recording quality from each individual for measurements and analyses. We processed the following three temporal parameters (Fig. 2): song part duration (D, time from start of the first element to end of the last element), number of element types (NT, variety in distinct element shape other than appearance shifts related to relative amplitude), and element rate (R, number of elements sung per second). We also processed three spectral parameters: maximum (Fmax, upper limit of the visible sound trace on the spectrogram of any song element within the target song part), minimum (Fmin, lower limit of the visible sound trace on the spectrogram of any song element within the target song part) and peak frequency (Fpeak, frequency of highest amplitude for the accumulated energy over all elements within the target song part). We measured the frequency parameters through visual inspection of sonograms, always generated using the very same settings in Avisoft. Although there are better-standardized ways to measure spectral parameters (see e.g. Zollinger et al. 2012; Cardoso and Atwell 2012), we are confident of consistent and objective cursor placement at upper and lower edges of sound traces attributed to birdsong for Fmax and Fmin. Furthermore, we believe there is no potential for measurement bias that can explain any of the patterns of spectral variation between song parts within songs or among songs of different sites.
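The three temporal parameters can be illustrated with a minimal sketch. It assumes each element has already been annotated with an onset time, an offset time and a type label; the `Element` class, its field names and the example values are hypothetical and are not part of the original measurement procedure, which was carried out by visual inspection in Avisoft. The element rate is taken here as the element count divided by the song part duration, which is one plausible reading of "number of elements sung per second".

```python
from dataclasses import dataclass


@dataclass
class Element:
    start: float      # onset time in seconds (hypothetical annotation)
    end: float        # offset time in seconds (hypothetical annotation)
    type_label: str   # element type assigned by visual classification


def temporal_parameters(elements):
    """Return (D, NT, R) for one song part (trill or warble)."""
    if not elements:
        raise ValueError("a song part needs at least one element")
    # D: time from the start of the first element to the end of the last one
    duration = max(e.end for e in elements) - min(e.start for e in elements)
    # NT: number of distinct element types
    n_types = len({e.type_label for e in elements})
    # R: elements per second (assumed to be count divided by D)
    rate = len(elements) / duration
    return duration, n_types, rate


# Made-up trill with four elements of two types
example_trill = [
    Element(0.0, 0.1, "A"),
    Element(0.2, 0.3, "A"),
    Element(0.4, 0.5, "A"),
    Element(0.6, 0.8, "B"),
]
D, NT, R = temporal_parameters(example_trill)
```

Applying the same function separately to the elements of part a and part b gives the paired values (Da and Db, NTa and NTb, Ra and Rb) that the later comparisons rely on.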

All acoustic measurements were made using Avisoft-SASLab Pro 4.52 software (Avisoft Bioacoustics, Berlin, Germany), at a sampling rate of 22.05 kHz. We made spectrograms with a frequency resolution of 86 Hz and a temporal resolution of 1.45 ms (spectrogram settings: Window: Flat Top window, overlap: 75%, window length: 256). Spectrograms of different song recordings were normalized with respect to their maximum amplitude, and all parameters were measured separately for the trill and the warble part of the song (the subscripts a and b were used to label trill and warble, respectively).
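As a quick check on the spectrogram settings, the frequency resolution follows from the sampling rate divided by the window length. The sketch below only covers this nominal bin spacing; the function name is hypothetical, and it is an assumption that Avisoft reports the frequency resolution in this way.

```python
def frequency_resolution_hz(sampling_rate_hz, window_length):
    """Nominal spacing between spectrogram frequency bins, in Hz."""
    if window_length <= 0:
        raise ValueError("window length must be positive")
    return sampling_rate_hz / window_length


# 22.05 kHz sampling rate and a 256-sample window, as stated in the text
resolution = frequency_resolution_hz(22050, 256)
```

With these settings the value is about 86.1 Hz, which matches the 86 Hz reported above.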

Statistical analyses and bioclimatic variables

To examine the structural differences between parts a and b of the recorded songs (see Fig. 2), we first tested the measured acoustic parameters for normality using a Kolmogorov–Smirnov test and subsequently used either a paired t test or a Wilcoxon signed-ranks test to assess whether acoustic differences between the repeating trill part and the variable warble part were significant (SPSS 16.0, SPSS Inc., Chicago, IL, USA). We then applied a linear mixed-effects model (R version 3.2.3, package lme4) to examine the correlation between song characteristics, latitude, and whether or not the population was migratory, in addition to 19 bioclimatic variables (Table 2) from the WorldClim database (2.5 decimal degree resolution).
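The paired comparison between song parts can be sketched as follows. This is a minimal illustration of the paired t statistic only, written with the Python standard library rather than SPSS; the normality check and the p-value lookup are omitted, and the function name and example values are hypothetical. Differences are assumed to be taken as trill minus warble, so a negative t indicates a lower value for the trill.

```python
import math
import statistics


def paired_t_statistic(part_a, part_b):
    """Return (t, degrees of freedom) for paired measurements.

    part_a and part_b hold one value per individual, in the same order,
    for the trill (a) and the warble (b).
    """
    if len(part_a) != len(part_b):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(part_a, part_b)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two pairs")
    sd = statistics.stdev(diffs)  # sample standard deviation of differences
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = statistics.mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


# Made-up durations (s) for three individuals
t_value, df = paired_t_statistic([1.0, 2.0, 3.0], [2.0, 4.0, 3.5])
```

When the differences are clearly non-normal, the rank-based Wilcoxon signed-ranks test mentioned above would be used in place of this statistic.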

Table 2 Descriptions of the WorldClim bioclimatic variables used in the mixed model analysis

The WorldClim database is widely used in species distribution modelling (De Clercq et al. 2015) and is composed of average monthly climate data collected between 1950 and 1990, interpolated globally at a resolution of 30 arc-seconds (Hijmans et al. 2005). We defined the random effect of our models as the population subgroup (random intercept, 7 groups) and the best fitting model for each song characteristic was selected using AIC stepwise selection (forward direction), where a model with an AIC score two points lower than another was considered a significantly better fit (Burnham and Anderson 2003). Song characteristics which expressed a skewed distribution were log-transformed to meet the assumptions of normally distributed residuals (Da, NTa, and Fmaxa, see also Table 3).
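The forward selection rule can be sketched in a few lines. This is an illustrative outline, not the authors' R code: `aic_of` is a hypothetical callable that would refit the mixed model with the given fixed effects and return its AIC, and the toy AIC table is made up. The sketch assumes that a predictor is added only when it lowers the AIC by at least two points.

```python
def forward_stepwise(candidates, aic_of, min_improvement=2.0):
    """Greedy forward selection of fixed effects by AIC.

    candidates: iterable of predictor names
    aic_of: callable taking a tuple of predictor names, returning the AIC
    """
    selected = []
    best_aic = aic_of(tuple(selected))  # null model, no fixed effects
    remaining = list(candidates)
    while remaining:
        # Score every model that adds one more predictor
        scored = [(aic_of(tuple(selected + [c])), c) for c in remaining]
        trial_aic, trial = min(scored)
        # Stop when the best addition improves AIC by less than the threshold
        if best_aic - trial_aic < min_improvement:
            break
        selected.append(trial)
        remaining.remove(trial)
        best_aic = trial_aic
    return selected, best_aic


# Toy AIC values standing in for refitted mixed models (made up)
toy_aic = {
    frozenset(): 100.0,
    frozenset({"bio_8"}): 95.0,
    frozenset({"bio_15"}): 99.0,
    frozenset({"bio_8", "bio_15"}): 94.0,
}
chosen, final_aic = forward_stepwise(
    ["bio_8", "bio_15"], lambda preds: toy_aic[frozenset(preds)]
)
```

In the toy example only bio_8 is retained, because adding bio_15 lowers the AIC by a single point, which falls short of the two-point threshold.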

Table 3 Summarized results from the mixed model analysis of song characteristics. The best fitting model for each song characteristic was selected using AIC stepwise selection where a model with an AIC score two points lower than another was considered a significantly better fit

We then compared our final models to a latitude-only model (where latitude was the only fixed effect) and a null model (with no fixed effects), and used the AIC score to determine whether the final models fitted significantly better than the alternative latitude and null models. We also calculated the marginal R² values for our models as described in Nakagawa et al. (2013) using the package MuMIn. Finally, a Spearman correlation matrix was calculated to examine the collinearity between the bioclimatic variables and latitude (see Appendix S1 in Supporting Information).
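For a Gaussian random-intercept model, the marginal R² is the share of the total variance attributable to the fixed effects. The sketch below shows only this ratio, assuming the three variance components have already been extracted from a fitted model; the function name and the example values are hypothetical, and this is not the MuMIn implementation.

```python
def marginal_r2(var_fixed, var_random, var_residual):
    """Variance explained by fixed effects, as a fraction of total variance.

    var_fixed: variance of the fixed-effect predictions
    var_random: variance of the random intercept (population subgroup)
    var_residual: residual variance
    """
    total = var_fixed + var_random + var_residual
    if total <= 0:
        raise ValueError("total variance must be positive")
    return var_fixed / total


# Made-up variance components
r2m = marginal_r2(var_fixed=2.0, var_random=1.0, var_residual=1.0)
```

A value near zero would mean that latitude or the bioclimatic predictors explain little beyond the differences among populations captured by the random intercept.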


Results

Comparisons between two different song parts

The trill part of the song was typically sung while perched and preceded a continuous transition into the warble part, which was sung in display flight. The two song parts differed significantly: the trill part was shorter (t = −4.11, P < 0.001), had fewer element types (t = −48.22, P < 0.001), a lower maximum frequency (t = −27.70, P < 0.001), a lower peak frequency (t = −19.50, P < 0.001), and a higher element rate (t = 4.40, P < 0.001) than the warble part (see Fig. 3 and Appendix S2). The minimum frequency did not differ significantly between the two song parts (Z = 91.00, NS).

Fig. 3

Box-whisker plots of 6 acoustic parameters showing divergences among the 7 island populations of the marsh grassbird. Populations are listed from left to right with increasing latitude (from south to north). Mixed models were employed to examine the relationship between the acoustic parameters of each population and their associated latitude and bioclimatic values. A ‘B’ in the upper right corner of a plot indicates that the model using only bioclimatic variables as its fixed effects provided the best fit while an ‘L’ indicates that the model using latitude alone provided the best fit. Null models with no fixed effects were also tested, but did not provide the best fit for any of the parameters

Song differentiation with latitude and climate

We found several significant correlations between song variation, latitude, and bioclimatic variables (Figs. 1, 3). The results of the analyses are summarized in Appendix S2. When comparing our latitude models to our null models, the distribution of Fmaxa (Fig. 3e) across populations was explained best by the latitude model, with higher trill frequencies towards the north. Db (Fig. 3b) and NTb (Fig. 3d) were explained best by the bioclimatic and latitude models, respectively, with longer warble parts but fewer warble element types towards the north. The addition of a fixed effect classifying populations as either migratory or sedentary did not significantly improve the AIC score of any of the models predicting the spatial distribution of song characteristics.

Although the models that included the bioclimatic variables performed better than their null model equivalents for all song characteristics, only the bioclimatic models for Da (Fig. 3a), Db (Fig. 3b), NTa (Fig. 3c), and Fmaxb (Fig. 3f) fitted significantly better than the alternative models using latitude alone as a fixed effect. According to the AIC stepwise selection, the predictors that resulted in the best fitting model for these four song parameters were, respectively: mean temperature of wettest quarter (bio_8); precipitation seasonality (bio_15, Fig. 1e); the combination of annual precipitation (bio_12, Fig. 1d) and precipitation of the wettest month (bio_13); and isothermality (bio_3, Fig. 1c; Table 3).


Discussion

We assessed geographic variation in two distinct parts of the marsh grassbird song in all seven populations known for the species in eastern and north-eastern China. The trill and warble parts differed significantly in several parameters, and we found significant correlations of song variation in both the trill and the warble with latitude and bioclimatic variables. The trill was shorter than the warble, had far less variety in terms of the number of different element types, and covered a smaller frequency range, expressed in a lower maximum frequency. The trill increased in maximum frequency towards the north, while the warble increased in duration and decreased in the number of different element types with increasing latitude. Bioclimatic variables varied sufficiently independently of latitude to significantly improve the explanatory value of our statistical models.

Sexual selection and within-song divergence

We found large and consistent differences between the two parts of the marsh grassbird songs that are very much in line with the expectations for at least some functional subdivision. The trill part is shorter, less diverse and covers a narrower frequency range than the warble part, which is in line with a functional bias towards male–male interactions for the repeating trill and towards male–female interactions for the variable warble (Catchpole 1982; Collins 2004). The trill part can also be sung by itself and seems to be the default, relatively low-effort advertisement signal given while perched somewhere within the territory boundaries. The fact that the transition towards the more complex part coincides with the energy-demanding ascending stage of the flight display also argues for display elaboration driven by sexual selection in the context of female mate attraction. However, it remains very difficult to conclusively attribute specific functions or targeted receivers to the different song parts, as both song parts are typically present in most songs and both sexes may be among the potential audience.

Changes in song use through the course of the season or after obtaining or losing a mate have often been taken as evidence for a functional subdivision of different song parts (e.g. Fessl and Hoi 1996; Beckett and Ritchison 2010). For example, blackcaps, Sylvia atricapilla, start their song with a complex warble part of wide bandwidth followed by a whistle part with louder and pure fluting tones of narrower bandwidth. The warble part becomes shorter as the breeding season progresses and is hardly sung at all after egg laying has finished (Collins et al. 2009). Furthermore, the whistle becomes more prominent in length relative to the warble during male–male interactions and playback (Leedale et al. 2015; Linossier et al. 2015). In the chaffinch, Fringilla coelebs, males start their song with a sequence of a few different trills followed by an unrepeated flourish. Indoor playbacks revealed that captive female chaffinches have a preference for a relatively longer flourish (Riebel and Slater 1998), while outdoor playbacks have shown that territorial males responded strongest when the trill part was relatively longer (Leitão and Riebel 2003).

Although these observational patterns of song use and response patterns to playback of song variation are suggestive of a functional subdivision with respect to the dual function of song, the most convincing evidence for a sex-dependent message in two different song parts would be the induction of a shift in a singing male upon exposure to a male or female conspecific. For the marsh grassbird, we have no insights yet into seasonal changes or response patterns to playback. However, we report a consistent use of two song parts across seven populations, and we argue that it would be valuable to investigate seasonal variation in the relative duration of both parts and the responses of birds to playback of these two parts.

Complex divergence patterns across latitude

We found several significant acoustic correlations with latitude, some as expected and others hard to explain. The duration of the warble part was positively correlated with latitude, which is in line with expected elaboration in more seasonal and unpredictable habitat with presumably stronger sexual selection pressures (cf. Catchpole 1982; Irwin 2000; Botero et al. 2009; Weir and Wheatcroft 2011). However, at the same time we found the number of different element types of the warble part to be negatively correlated with latitude. This was not congruent with elaboration in duration and not in line with the variation in selection pressures expected from the ecologically divergent conditions for the sedentary populations in the south and the migratory populations in the north. Consequently, the marsh grassbird contributes another case of contradictory patterns to the literature and confirms that there are no simple explanations or clear general patterns (cf. Byers 2011, 2015; Medina and Francis 2012).

Although there is a clear lack of congruency between the geographic patterns in duration and element diversity, there are still strong patterns that call for an explanation. The fact that both correlations were found in the variable warble and not in the repeating trill confirms that different song parts with apparent functional differences can evolve more or less independently (cf. Kroodsma 1981; Baker 2011). A recent study on different subspecies of the reed bunting (Emberiza schoeniclus) in Europe provided a similar case for a species with wetland island populations (Gordinho et al. 2015). In that study, latitudinal divergence in song complexity was also opposite to what would be expected.

However, follow-up studies are needed to gain more insights into functional differences between song parts and possibly divergent selection pressures on specific acoustic features. Investigations of correlations between parameters at the individual level and across the season may reveal internal coherence and functional significance. Playback experiments in marsh grassbirds, in which duration and element diversity are manipulated independently, may be another tool that could yield critical insights into the interplay between sexual selection, acoustic divergence and population divergence.

Song variation and climate

Our analyses of geographic variation of grassbird song also revealed a strong explanatory power for several bioclimatic variables. Latitude partly explains climatological conditions, but these are modified by geographic factors such as proximity to the coast, altitudinal variation, and the predominant directions of ocean currents and wind. Furthermore, latitude likely carries more explanatory value in addition to local climate for species that cover a large geographic area during dispersal and migration.

The impact of bioclimatic variables on birdsong has been reported before (Botero et al. 2009), and can be driven by effects of precipitation and temperature on vegetation, food availability and the consequences for sexual selection. Rainfall patterns and temperature fluctuations determine, for example, reed growth (Garris et al. 2015) and food availability (Gwitira et al. 2015) and thereby affect distribution, density, spring arrival dates and breeding success of reed bird communities (e.g. Virkkala et al. 2005; Halupka et al. 2008; Eglington et al. 2015).

Although a more variable and unpredictable climate has been related to more elaborate songs (Botero et al. 2009), it remains hard to directly explain why a variable such as isothermality matches strongly with maximum song frequency, as found for the warble part of the marsh grassbird. However, it should be no surprise that climate and latitude together may better predict variation in sexual selection pressures related to variable and unpredictable climate (Irwin 2000; Botero et al. 2009). The correlation between spectral bandwidth and latitude may also be attributed to habitat structure and community composition: selection against using higher frequencies may be relaxed because of more open vegetation and less competition for acoustic space due to fewer vocally active sympatric species in the north (Weir et al. 2012; Singh et al. 2015). Although community richness of vocal bird species and relative habitat openness in our seven populations may vary along the lines of these published studies, such measurements are not yet available and await future studies.

Propagation effects of vegetation on song transmission and the phenomenon of habitat-dependent song variation have been reported especially for forest habitat (Slabbekoorn 2004; Ey and Fischer 2009), but reed stalks are also known to filter selectively and provide better propagation conditions for low than for high song frequencies (Cosens and Falls 1984). An important point may be that the warble is sung in flight and may be largely free of excess attenuation, whereas the trill is predominantly sung perched and may have a transmission pathway to potential receivers through dense vegetation. However, we currently have no access to environmental data of the required detail for the marsh grassbird, and further progress in our understanding depends on future sampling efforts.


Conclusions

Our acoustic description of vocal behaviour of marsh grassbirds across three migratory and four sedentary populations in China is in line with the expectations for a functional subdivision between the different parts of their songs. We also found interesting but not always congruent acoustic correlations with latitude and climate that may be related to variation in sexual selection pressures and migratory behaviour. However, a better understanding requires experimental studies into functional differentiation and possible seasonal and geographical variation, as well as acoustic exploration of latitudinal variation in vegetation density and sound propagation. The discrete nature of the seven populations of this rare and endemic Chinese subspecies makes the marsh grassbird highly suitable for such studies addressing the interplay between birdsong, migration and speciation. Furthermore, conservation concern also makes these birds worthwhile to investigate, as they feature on the IUCN Red List and the kind of wetland areas they inhabit are globally threatened (Quesnelle et al. 2013).