Introduction

Sexual ornaments are typically under directional selection imposed by mating preferences (Hoekstra et al. 2001; Siepielski et al. 2009). This selection favours greater trait exaggeration, which is in turn expected to increase the production, maintenance, or wearing costs of ornamentation (Jennions et al. 2001). Mate choice also has costs, such as time, energy, and opportunity costs, compared with random mating (Cotton et al. 2006). To support continued mate choice, there need to be fitness benefits that counterbalance the costs of choice. Costs of ornament exaggeration select for a link between ornament expression and body condition because the proportional costs are greater for individuals in low condition (Grafen 1990). Condition-dependence of ornamentation plays a fundamental role in ensuring the fitness benefits of mate choice because it may link ornament expression to direct benefits of various sorts (avoidance of diseases, territory quality, parental contribution etc., Møller and Jennions 2001; Hegyi et al. 2015), but also to genetic benefits (Møller and Alatalo 1999; Prokop et al. 2012), for example via the additive genetic variance of condition (“genic capture”, Rowe and Houle 1996). It is therefore not surprising that condition-dependence has become a central concept in sexual selection research.

However, this concept also raises several questions. First, it is increasingly emphasized that a proportional increase of fitness costs with increased trait exaggeration is not necessary for honest condition-dependent advertisement (Számadó 2011). For example, there can be direct mechanistic links between body condition and ornament expression (e.g. physical trait deterioration, physiological pathways) that automatically ensure the honesty of the signal (Weaver et al. 2017). Second, interpreting experimental tests of condition-dependence can be difficult as nearly all of these also impose stress on the individual, and stress may have different effects on ornaments than a simple change in condition (Peters et al. 2011). In other words, a specific stress effect may lead to a drastic shift in the fitness consequences of poor condition when crossing some threshold in condition deterioration (e.g. entering a fasting state). It may also lead to specific effects of nutritional reserve depletion that are not detectable when looking at reserve state or reserve accumulation. Third, it has been hypothesized that the variability of resource acquisition compared with resource allocation may drastically shift the relationships between condition-dependent ornamentation and fitness-related traits. For example, if there is little among-individual variation in resource acquisition, positive correlations between costly ornamentation and fitness measures are unlikely (Morehouse 2014). Finally, in relatively static ornamental traits, correlation with condition has multiple interpretations. These include condition-dependence, i.e. effects from the trait-production period (Meadows et al. 2012) or from some resource-demanding period preceding ornament production (e.g. previous reproduction, Höglund and Sheldon 1998), but they also include the consequences of ornamentation on condition, e.g. wearing costs (Tibbetts 2014) or the facilitation of resource acquisition (Klug et al. 2010). 
Accordingly, progress towards better understanding the information content of ornamental traits would be facilitated by (1) comparing similar ornaments with slightly different honesty-enforcing mechanisms, (2) using separate measures for actual nutrient availability, nutrient acquisition rate, and nutritional stress, and (3) examining ornaments and multiple measures of condition in various contexts.

Here, we look at several possible links between nutrition and two different white plumage patches of collared flycatchers (Ficedula albicollis): one patch on the flight feathers, moulted in summer (wing patch, Török et al. 2003) and one patch on the contour feathers, moulted in winter (forehead patch, Hegyi et al. 2002). Both patches show within-individual changes (Hegyi et al. 2007) and are under sexual selection in our population (Hegyi et al. 2006, 2010). Wing patch size has been suggested to show greater condition-dependence than forehead patch size in our population (Hegyi et al. 2002; Török et al. 2003). We use three different measures of condition, including a measure of actual nutrient stores (residual body mass, Schulte-Hostedde et al. 2005; Labocha and Hayes 2012), a measure of recent lipid reserve accumulation rate, i.e. anabolism (plasma triglyceride level, hereafter TG, Jenni-Eiermann and Jenni 1994; Devost et al. 2014), and a measure of recent lipid reserve depletion (catabolism), a possible proxy for nutritional stress (plasma beta-hydroxy-butyrate level, hereafter HBA, Jenni-Eiermann and Jenni 1994; Anteau and Afton 2008). A previous small-sample trait-comparison study focused on metabolites, sexual signal expression, and pairing of courting males and indicated the potential utility of the metabolite-based nutritional indices in the context of sexual selection research (Hegyi et al. 2010). Here, we analyse condition measures and plumage patch sizes in three different contexts: ornamentation and current condition at courtship, ornamentation and current condition at nestling rearing, and change of ornamentation to the following year in relation to condition during the current breeding bout. Finally, we also assess the correlation structure of the three condition measures and its stability from courtship to nestling rearing. We have the following predictions.

1. Residual mass will positively correlate with ornament change to the next year, particularly for wing patch size (see Hegyi et al. 2002; Török et al. 2003).

2. Actual reserve accumulation will correlate (uncertain sign) with current wing patch size in the courtship phase, when this patch has a major role as a territorial signal (Garamszegi et al. 2006a), but such a correlation is not expected for forehead patch size or in the nestling stage, where male behaviour is not linked to patch sizes (Kiss et al. 2013; Laczi et al. 2017).

3. Reserve depletion will negatively correlate with current wing patch size during courtship if large-patched males suffer energetic shortcomings due to the territorial role of the trait (Garamszegi et al. 2006a, b; see also Kötél et al. 2016).

4. Reserve depletion will negatively correlate with ornament change to the next year if there are long-term consequences of brief nutritional stress episodes for the expression (wing and forehead patch) or the durability (forehead patch on the contour feathers) of the ornament produced at the next moult.

5. Correlations between current condition, reserve accumulation, and reserve depletion will be weak and stage-dependent, principally because current condition reflects the opposing effects of the other two variables.

Materials and methods

Field methods

The collared flycatcher is a socially monogamous, sexually dichromatic, insectivorous passerine bird breeding in tree holes. We conducted this study from 2014 to 2018 at our nest box plots (approx. 600 nest boxes hosting approx. 150–400 breeding pairs) maintained since the 1980s in the Pilis-Visegrádi Mountains, Hungary (47° 42′ N, 19° 01′ E). We caught courting males with spring traps at their nest box after their arrival from migration. We also followed breeding activity and caught parents feeding nestlings at 8–10 days of nestling age. We marked all birds with numbered metal rings and determined age as a binary variable (yearling or older) based on wing colour and wing patch size (yearlings: brown wing and small patch, older: dark wing, large patch). This categorization is hereafter referred to as binary age. We measured body mass with a spring balance to the nearest 0.1 g. We used callipers to measure to the nearest 0.1 mm tarsus length, maximum forehead patch height and width, and wing patch length on primaries 4 to 8. We estimated forehead patch size as the product of width and height, and wing patch size as the sum of the measured lengths (for validation and repeatability, see Hegyi et al. 2002; Török et al. 2003).
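The two patch size estimates described above can be sketched as follows. This is a minimal illustration only: the function names and the example measurements are hypothetical and are not taken from the study.

```python
def forehead_patch_size(max_height_mm, max_width_mm):
    """Forehead patch size as the product of maximum height and width (mm^2)."""
    return max_height_mm * max_width_mm


def wing_patch_size(lengths_mm):
    """Wing patch size as the sum of patch lengths on primaries 4 to 8 (mm)."""
    if len(lengths_mm) != 5:
        raise ValueError("expected one patch length for each of primaries 4-8")
    return sum(lengths_mm)


# Hypothetical measurements for one male (not real data)
fps = forehead_patch_size(7.5, 10.0)
wps = wing_patch_size([8.0, 9.5, 10.0, 9.0, 7.5])
```

Note that the two estimates are on different scales (an area versus a summed length), which is one reason why they are entered as separate variables rather than combined.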

Blood sampling and metabolite assays

In 2014–2017, we took blood samples from the brachial vein into heparinised capillaries, stored these in a cooling bag, and centrifuged them and separated the plasma on the day of collection. The plasma was stored at −20 °C. Plasma samples were posted on dry ice to the Schweizerische Vogelwarte, Sempach, and triglyceride (TG) and beta-hydroxy-butyrate (HBA) levels were determined using two types of commercially available kits (2014–2015: TG Invicon Triglyceride HIT917, HBA Wako Autokit 3-hydroxybutyrate; 2016–2017: TG Cayman Chemical Art. No. 10010303, HBA Cayman Chemical Art. No. 700190). Kit type was included as a factor in the statistical analysis. Most samples (except for those with extremely small volume) were analysed in duplicate. The repeatability of the measurements was high (intraclass correlation ± SE; old kits, 0.806 ± 0.068 for HBA and 0.959 ± 0.062 for TG; new kits, 0.938 ± 0.034 for HBA and 0.940 ± 0.016 for TG), so we averaged the duplicate values for statistical analysis. Blood samples with small volumes and those from 2015 were not analysed for HBA, so sample size for this metabolite was lower than that for TG. To ensure comparability, we restrict the statistical analyses of TG here to the HBA sample (thereby excluding 41 data points from 2015 and 30 other data points). Analysing the whole set of TG data leads to results similar to those reported here. Given that birds start feeding at sunrise, whereas our field protocols permit capture only from late morning onwards, i.e. several hours after sunrise, we are confident that neither TG nor HBA values were influenced by overnight fasting at either courtship or nestling rearing (see also Hegyi et al. 2010). Blood sample processing was done blind to the phenotypic data of the individuals.

Statistical methods

There was no evidence of significant among-year variation in residual mass or metabolite data (other than that attributable to kit type) in this dataset (pairwise difference between 2016 and 2017, Tukey HSD test, p = 0.686 for TG, p = 0.358 for HBA; other year pairs were processed with different assay kits and therefore cannot be compared), so it was unnecessary to enter year as a factor in addition to kit type in the following analyses. TG and HBA concentrations were log10-transformed before analyses due to their right-skewed distributions. There were too few repeatedly sampled males for reliable statistical analysis of the repeats (males with both TG and HBA data; multiple years for the same stage: 9 individuals; both stages within a year: 3 individuals). We therefore kept one of the repeated data points for analysis. Among repeated data from the same male in different years, the first or the most complete data point was retained. Among repeated data from the same male within a single year, the data point at courtship was retained as this was the phase with the lowest sample size. This led to final sample sizes of 29 actually sampled courting males, 69 actually sampled nestling rearing males, and 35 (forehead patch size) and 37 (wing patch size) males for the analysis of within-individual change in patch sizes. When we complement the courting male dataset with data from earlier years (used in Hegyi et al. 2010), thereby practically doubling its size, the results remain similar to those reported below. We do not include these data here because this would hamper comparability with data from the nestling rearing phase. We fitted general linear models in Statistica 5.5 (StatSoft, Inc) using backward simplification and reintroduction of the removed terms one by one (Hegyi and Laczi 2015).

We first assessed relationships between patch sizes and simultaneously taken condition measures in the courtship and the nestling rearing phases. Due to the much greater flexibility of condition measures than patch sizes in the short term, we treated patch sizes as the determinants of current condition and not vice versa. We therefore entered metabolite level or body mass as a dependent variable, phase (courtship versus nestling rearing) and kit type (see above) as factors, and forehead patch size and age-standardized wing patch size (Török et al. 2003) as continuous independent variables. We also entered the interactions between phase and patch sizes. Finally, we added tarsus length as a covariate in the case of body mass (correlation of body mass and tarsus length for the 98 data points used here, r = 0.233, p = 0.021).

We then looked at relationships between current condition and metabolites at nestling rearing and the subsequent change of patch sizes for the following year. We used binary age as a factor, raw or residual values of current condition, metabolite and first year patch size as independent variables, and patch size change (original subtracted from final value) as a dependent variable. Residual independent variables were calculated by correcting body mass for binary age and tarsus length (residual condition), metabolite levels for kit type, and first year wing patch size for binary age in general linear models. Values of first year forehead patch size were entered in the model without correction as this patch was not age-dependent in this dataset.
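The residual corrections described above can be illustrated with a short sketch. It makes two simplifying assumptions that are ours, not the authors': (1) residuals from a general linear model with kit type as the only factor are computed as deviations from the kit-specific mean of the log10-transformed values, and (2) body mass is corrected for a single covariate (tarsus length) by simple least-squares regression, leaving out the binary age factor. The function names and example values are hypothetical; the original analyses were run in Statistica.

```python
import math
from collections import defaultdict


def kit_corrected_log_levels(levels, kit_types):
    """log10-transform metabolite levels and subtract the kit-specific mean,
    which equals the residual from a linear model with kit type as sole factor."""
    logged = [math.log10(x) for x in levels]
    groups = defaultdict(list)
    for value, kit in zip(logged, kit_types):
        groups[kit].append(value)
    means = {kit: sum(vals) / len(vals) for kit, vals in groups.items()}
    return [value - means[kit] for value, kit in zip(logged, kit_types)]


def ols_residuals(y, x):
    """Residuals of y from a simple least-squares regression on one covariate x."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Hypothetical values (not real data)
residual_hba = kit_corrected_log_levels(
    [1.0, 100.0, 10.0, 1000.0], ["old", "old", "new", "new"]
)
residual_mass = ols_residuals([12.8, 13.1, 13.6, 13.3], [19.2, 19.5, 19.9, 19.6])
```

Correcting for binary age and tarsus length together, as in the model described above, would require a multiple regression routine rather than the single-covariate function shown here.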

Finally, we used Pearson correlations to assess the interrelation of residual mass, reserve acquisition, and reserve depletion. Before correlating them, metabolite levels were corrected for kit type and body mass for tarsus length by calculating residuals from general linear models. Using these residuals, we then first assessed the concordance of the trait interrelation matrix between courtship and nestling rearing using CPC (Phillips and Arnold 1999). This program calculates the Akaike information criterion (AIC) as a measure of model suitability for models corresponding to different degrees of matrix concordance. The model assuming smallest concordance is called “unrelated” and implies that all PC axes of the two matrices are different in direction. The next level of concordance is “one common PC” where the first PC axis is similar in direction but the other axes are not. The highest levels of concordance are matrix proportionality (directions and relative eigenvalues of all PC axes are similar) and finally matrix equality (directions and absolute eigenvalues of all PC axes are similar). Based on the AIC values, we can choose the most suitable model as the simplest model within a given threshold AIC difference from the model with the lowest AIC (see further details in the “Results” section). We then based our final correlation tests on the result of this concordance test.
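The selection rule described above (the simplest model within a threshold AIC difference from the lowest AIC) can be sketched as follows. The function name is hypothetical, and the sketch assumes that higher concordance corresponds to a simpler model, so that equality is the simplest and "unrelated" the most complex. The AIC values are invented purely for illustration and do not reproduce Table 3.

```python
def select_model(aic_by_model, threshold=2.0):
    """Return the simplest model whose AIC is within `threshold` of the lowest AIC.

    `aic_by_model` is a list of (name, AIC) pairs ordered from the simplest
    model to the most complex one.
    """
    if not aic_by_model:
        raise ValueError("no models supplied")
    best_aic = min(aic for _, aic in aic_by_model)
    for name, aic in aic_by_model:
        if aic - best_aic <= threshold:
            return name


# Invented AIC values, ordered from simplest to most complex (not from Table 3)
illustrative_aic = [
    ("equality", 14.0),
    ("proportionality", 11.0),
    ("one common PC", 9.0),
    ("unrelated", 8.0),
]
selected = select_model(illustrative_aic, threshold=2.0)
```

With these invented values, the lowest AIC belongs to the "unrelated" model, but "one common PC" is the simplest model within 2 AIC units of it and is therefore the one returned. A more permissive threshold moves the choice towards simpler models.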

Data availability

The datasets analysed during the current study are available from the corresponding author on reasonable request.

Results

Condition at ornament measurement

For plasma TG level, there was an interaction between phase and wing patch size (Table 1). The effect of wing patch size was negative at courtship (F1,26 = 4.382, p = 0.046, effect size Pearson r = −0.380, Fig. 1a) but not significant at nestling rearing (F1,64 = 0.001, p = 0.972, r = 0.004). For plasma HBA level, there was a tendency (p = 0.061) for a similar interaction between phase and wing patch size (Table 1) and the pattern was similar to TG (courtship, F1,26 = 3.751, p = 0.064, r = −0.355, Fig. 1b; nestling rearing, F1,64 = 0.513, p = 0.477, r = 0.089). For body mass, no significant effect emerged except that of tarsus length (Table 1).
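The effect sizes reported above are numerically consistent with the standard conversion r = sqrt(F / (F + error df)) for F tests with one numerator degree of freedom. Whether the authors used exactly this conversion is our assumption, and the function name below is hypothetical. Because F carries no sign, the direction of the effect has to be supplied separately.

```python
import math


def effect_size_r(f_value, df_error, sign=1):
    """Convert an F statistic with 1 numerator df into a Pearson r effect size.

    `sign` (+1 or -1) is the direction of the effect, which F does not carry.
    """
    return sign * math.sqrt(f_value / (f_value + df_error))


# F values and error df as reported in the text; signs taken from the text
r_tg_courtship = effect_size_r(4.382, 26, sign=-1)
r_hba_courtship = effect_size_r(3.751, 26, sign=-1)
r_tg_nestling = effect_size_r(0.001, 64)
r_hba_nestling = effect_size_r(0.513, 64)
```

Rounded to three decimals, these values agree with the r values given in the text.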

Table 1 Relationship between white patch sizes of collared flycatcher males and actual condition in the courtship and nestling rearing phases. General linear models with backward parameter removal and reintroduction

Fig. 1 Relationships of the age-standardized wing patch size of collared flycatcher males with lipid metabolite levels in the courtship phase. a Triglyceride (TG). b Beta-hydroxy-butyrate (HBA)

Condition and subsequent change in ornament size

When looking at change in wing patch size, the only significant effects were age (yearling > older) and original patch size (negative), with no significant effect of previous condition measures (Table 2). Change in forehead patch size, on the other hand, was negatively related to original patch size and also negatively related to HBA level in the previous breeding season (Table 2, Fig. 2).

Table 2 Effects of condition measures in the nestling rearing phase on the changes of white patch sizes to the next year in collared flycatcher males. General linear models with backward parameter removal and reintroduction. TG and HBA levels were corrected for kit type while body mass was corrected for tarsus length

Fig. 2 Relationship of plasma beta-hydroxy-butyrate (HBA) level in the nestling rearing period with the subsequent change of the forehead patch size (FPS) in collared flycatcher males. The plotted values were corrected for kit type (HBA) and for original patch size (FPS). The relationship changes very little when omitting the lowest value of patch size change (HBA effect after omission, F1,29 = 5.686, p = 0.024)

Interrelations of condition indices

Models assuming higher levels of matrix similarity received lower values of AIC and can therefore be tentatively considered increasingly appropriate (Table 3). However, the differences were small, likely due to the low average correlation between the examined traits (see below). If we rely on a permissive threshold of AIC difference of 2 (see, e.g. Symonds and Moussalli 2011 for detailed discussion), we can conclude that the two matrices are proportional, that is, all PC axes are similar in direction and relative importance between the courtship and nestling rearing phases. Based on this, we fitted a pooled correlation matrix for the two phases (98 data points). The pooling did not distort the correlations as none of the three metrics showed a significant breeding phase effect (p > 0.167). In the pooled data, there was no significant correlation between residual TG level and either residual mass (r = 0.006, p = 0.956) or residual HBA level (r = −0.019, p = 0.853), whereas the correlation between residual mass and residual HBA level was moderately negative (r = −0.310, p = 0.002, Fig. 3).

Table 3 Suitability values (AIC) for models corresponding to different degrees of similarity between the courtship and the nestling rearing phases in correlation matrices of condition measures in collared flycatcher males. Lower AIC values imply greater model suitability

Fig. 3 Relationship between plasma beta-hydroxy-butyrate (HBA) level and residual body mass in collared flycatcher males. Data from courtship and nestling rearing were pooled. HBA level was corrected for kit type and body mass for tarsus length

Discussion

Here, we approached the topic of “condition and ornamentation” in a more comprehensive way than is typical in the literature. For two white plumage traits, we estimated current nutritional reserve state, reserve accumulation, and reserve depletion in relation to current trait expression at both courtship and parental care, and the apparent consequences of the same nutritional measures for future change in trait expression. We also assessed the interrelations of these nutritional measures. For the focal ornaments, plasticity, function, and behavioural correlates are relatively well known from previous research in our population (see below), and here, we will interpret our findings in the context of these previous results.

We found that current ornament expression showed no relationship with current condition (residual mass) in the present dataset. The relative difficulty of detecting relations of current ornamentation with actual body condition may stem from the fact that reserve state regulation may change among reproductive stages (Moreno 1989; Jones 1994) and the nutritional consequences of ornament expression may also change among stages (see below). Stage-dependent allocation shifts may also dissociate residual mass from actual nutritional condition (Labocha and Hayes 2012). In contrast, resource acquisition rate and nutritional stress were related to current wing patch size, although only in the courtship period.

As TG and HBA levels are rapidly changing (Jenni-Eiermann and Jenni 1994), correlation with an ornament replaced once a year is expected only if the ornament very consistently predicts the dynamics of nutritional reserves. With respect to the courtship period, the roles and information content of forehead and wing patch sizes are different in the two intensively studied populations of collared flycatchers (island of Gotland and Pilis-Visegrádi Mountains). In Gotland, forehead patch size is condition dependent (Gustafsson et al. 1995) and it is the main trait involved in territorial competition (Pärt and Qvarnström 1997). In the Pilis, in contrast, forehead patch size is not condition dependent (Hegyi et al. 2002, 2007) and not involved in territorial competition (Garamszegi et al. 2006a), although it does seem to predict mate acquisition success (Hegyi et al. 2010). In line with this, we found no sign of a robust correlation between forehead patch size and actual nutritional indicators in our population.

In the Pilis population, male wing patch size is a phenotypically plastic trait (Török et al. 2003; Hegyi et al. 2007) and seems to be a principal determinant of territorial responses (Garamszegi et al. 2006a). Likely as a consequence of this, male wing patch size is spatially autocorrelated in our population, suggesting non-random territory acquisition in relation to the trait (Hegyi et al. 2008). In the Gotland population, experimentally enlarging the territorial badge (forehead patch size) in young males led to reduced territory acquisition success and reduced rates of feeding the nestlings (Qvarnström 1997; see also Kilpimaa et al. 2004; Mitchell et al. 2007). In line with this, we expected nutritional state in the courtship period to mainly reflect the temporally consistent costs of wearing the territorial badge (in this case, wing patch size, Garamszegi et al. 2006a). We indeed found a reduced rate of lipid reserve accumulation in large-patched males, but there was also a marginally non-significant tendency for less pronounced reserve depletion and no significant pattern with current residual mass. This may mirror the combination of the positive and negative aspects of large wing patch size. A large patch may increase costs from territorial disputes and may therefore lead to lower TG level (Garamszegi et al. 2006a), but it may at the same time mitigate the risk of serious resource shortages through potentially higher territory quality and thereby reduce HBA level (Hegyi et al. 2008). Interestingly, a previous study found a marginal negative relationship between a physiological stress measure (HSP60 level) and wing patch size in courting males of our population (Garamszegi et al. 2006b). The above patterns clearly indicate the benefit of using direct measures of recent nutrition when testing the effects of badge size on male performance. Despite early suggestions (e.g. Gustafsson et al. 1994), such measures have not spread in this field.

In the nestling rearing stage, no measure of actual nutrition showed any correlation with badge sizes. In our population, despite some consistent relationships between nestling growth and male badge sizes (Szöllősi et al. 2009; Hegyi et al. 2011), no correlational or experimental study has shown any robust link between measures of male parental care and forehead or wing patch size (Kiss et al. 2013; Kötél et al. 2016; Laczi et al. 2017). Therefore, nutritional reserve management in the nestling rearing period may indeed not be closely linked to ornamentation. However, condition during parental care and badge size can also be linked in the reverse causal direction, with reproductive effort affecting the future expression of badge size.

In our population, forehead patch size seems to have much lower sensitivity to environmental conditions than wing patch size (Hegyi et al. 2002, 2007; Török et al. 2003). Therefore, we expected an effect of previous condition (during nestling rearing) on wing but not forehead patch size, but found no such pattern for either patch. Our present results do not question the condition-dependence of wing patch size because our current estimate for size-corrected previous body mass does not differ significantly from a previous, significant estimate obtained on a larger dataset (Török et al. 2003). A more surprising pattern we found was the negative relationship of change in forehead patch size with a nutritional reserve depletion measure (HBA level) in the previous nestling rearing period, but not with previous residual mass or previous reserve accumulation.

White reflectance is due to the quantitative aspects of the feather keratin matrix (Prum 2006; Igic et al. 2018) and these same aspects may influence the mechanical strength of the feather. It has previously been suggested that white areas on feathers can be uncheatable “revealing indicators” of feather macrostructural strength as these areas are particularly prone to mechanical damage and parasitic consumption (Kose and Møller 1999; Ferns and Hinsley 2004). Nutritional stress during moult can lead to poorer mechanical strength of ornamental feathers (Andersson 1994). White ornament abrasion may therefore represent a pathway to indicate individual quality (Griggio et al. 2011). The forehead patch consists of some contour feathers of the forehead, and it is known that this patch indeed decreases in size due to abrasion of the feather tips, even across the same breeding season (Griffith and Sheldon 2001; our unpublished data). We suggest that severe nutritional reserve depletion during the generally stressful period of nestling rearing may have impinged on physiological state during the next winter moult when the forehead patch was formed. This could have resulted in either (1) smaller post-moult patch sizes than the original (i.e. a specific stress effect on structural ornament expression; Peters et al. 2011) or (2) similar post-moult patch sizes but lower abrasion-resistance of the new patch (i.e. stress effect on structural ornament durability; Griggio et al. 2011). Further investigations are needed to determine the mechanism of forehead patch reduction after nutritional stress.

The classical theory of life histories makes crucial predictions concerning the interrelations of costly, fitness-related traits. If the resource pool individuals can rely on is similar, individual variation will be restricted to resource allocation, and among-individual correlations of fitness-related traits will tend to be negative. If, on the other hand, individuals vary in their overall resource pool available for allocation, among-individual trait correlations may be positive (van Noordwijk and de Jong 1986). It has recently been suggested that this has high relevance for sexual trait information content. If variability in resource acquisition dominates, condition-dependent ornaments are expected to positively correlate with other fitness-related traits, while a negative correlation is expected when variability in resource allocation dominates. In other words, condition-dependent ornaments can be good fitness indicators only if there is consistent variation among individuals in the overall resource pool they possess (Morehouse 2014).

We may consider residual mass as an index of the resource pool available for allocation and TG level as a measure of recent resource acquisition. Our present data indicate that the dynamics of resource stores and the actual state of these stores may vary rather differently. Although recent reserve depletion was visible on the current condition of the individual, there was little relationship between recent reserve accumulation and condition. This suggests that it may not be easy to trace the resource pool available for allocation to ornaments and other functions. Furthermore, it has been noted that positive correlations may arise between ornaments and life history traits even in the absence of systematic differences in total resource pool if resource utilization efficiency shows individual variation (Olijnyk and Nelson 2013). Moreover, the whole situation becomes much more complicated if allocation rules systematically change with resource acquisition (Robinson and Beckerman 2013; Descamps et al. 2016). Evidence is indeed accumulating that allocation to ornament development may depend on condition, resource stress, or physiological state (Badyaev and Vleck 2007; Lindström et al. 2009; Schwab and Moczek 2016). Therefore, it seems conceptually important to approach the relationships of ornament expression and condition in more detail than is currently the norm.

To summarize, there was a lack of clear relationship in this dataset between current body condition and consequent changes of a dynamic plumage ornament (wing patch size, Török et al. 2003), but a robust relationship between a specific measure of recent nutritional reserve depletion and the observed future expression of a less dynamic ornament (forehead patch size, Hegyi et al. 2006). Moreover, in the principal phase of using the wing patch as a territorial badge (courtship period, Garamszegi et al. 2006a), size of the badge was related to reserve accumulation and tended to correlate with reserve depletion, but it was unrelated to current condition. Finally, no measure of nutrition correlated with ornament sizes in the nestling rearing phase when ornamentation does not seem to predict male behaviour. These results indicate that the increasingly realized benefits of assessing physiological state in studies of condition-dependence in a dynamic and multivariate manner (e.g. Milot et al. 2014; Hennin et al. 2018) may extend to sexual ornaments, especially if reserve dynamics are measured in multiple phases of the yearly cycle. Finally, our findings support the opinions that future research on the condition-dependent regulation of sexual trait expression will benefit from considering the specific roles of stress (Buchanan 2011; Moore et al. 2016) and the regulation of resource acquisition (Morehouse 2014).