Introduction

Many studies have shown that females have preferences for males with more elaborate secondary sexual traits such as more diverse songs (Gentner and Hulse 2000; Drăgănoiu et al. 2002) and larger sexual ornaments (Sheldon et al. 1997). Females may have preferences for those males with superior traits because males will provide either direct benefits such as parental care or indirect genetic benefits to the offspring (Andersson 1994).

However, female mating preferences may vary according to the context (Qvarnström 2001). Indeed, females have displayed differences in preference according to social context (such as presence or absence of competitors) (Callander et al. 2012), environmental conditions (Hale 2008), and timing of breeding (Qvarnström et al. 2000). This last factor can be particularly important for migratory birds that are constrained by their migration schedule. It is common that males arrive earlier than females (Møller 2004; Tottrup and Thorup 2008). Early arrival allows males to settle on the best territories (Aebischer et al. 1996) and also to obtain females more easily as females may also use territories as cues for mate selection (Alatalo et al. 1986). Among males arriving at the same time, those with brighter and/or larger plumage ornaments usually win competitions for territories (Pärt and Qvarnström 1997; Beck 2013). Therefore, females can choose these highly ornamented males to have access to necessary resources like nest sites or food. However, choosing dominant mates at the start of the season may also be costly as such males often try to attract a secondary or extra-pair female instead of caring for the primary nest (Qvarnström 1997, 1999). Consequently, it may be advantageous to only choose males with larger ornaments later in the season as their chances to find another mate are low at that time, and thus, they are expected to invest more in the feeding of nestlings (Qvarnström et al. 2000).

In addition to the choice of social mate, females might use several other mechanisms to increase their fitness. First, they may be unfaithful to their social mate. Extra-pair young obtained with another male of superior quality may be of higher quality compared to within-pair young (Akçay and Roughgarden 2007, but see Krist and Munclinger 2011). Many studies show that females cuckold their mates with older males (Cleasby and Nakagawa 2012) and more ornamented males (Kempenaers et al. 1992; Richardson and Burke 1999; Akçay and Roughgarden 2007), though the role of ornaments remains controversial (review in Akçay and Roughgarden 2007). In contrast to female preferences for social mates, which may be context-dependent due to the trade-off between the direct and indirect benefits of mate choice (Qvarnström 2001), preferences for extra-pair males with large ornaments may be consistent during the course of the season as these males can provide only genetic benefits, and thus, there is no trade-off with their paternal care even at the start of the season.

Extra-pair copulations are a prerequisite for another process that has come to the center of attention of ecologists in recent years. Whenever females copulate with more than one male, different ejaculates compete to fertilize the eggs, which is known as sperm competition. Many factors may influence the success of sperm in fertilizing eggs: the timing of copulation (Birkhead et al. 1989), the frequency of copulation (Møller and Birkhead 1993; Mougeot 2004), and sperm traits (Snook 2005). Among these sperm traits, viability (Smith 2012), speed of swimming (Birkhead et al. 1999), number (Laskemoen et al. 2010), and size of the sperm (Lifjeld et al. 2010; Bennison et al. 2015) may modulate the success of egg fertilization.

Although it has previously been shown that male arrival date (Aebischer et al. 1996), secondary sexual ornaments (Sheldon and Ellegren 1999), and sperm size (Bennison et al. 2015) can have fitness effects, these factors were usually tested in isolation, which complicates the evaluation of their relative importance. One remarkable exception is the study of Qvarnström et al. (2000), which tested how the benefits of female choice of male ornaments depend on the time of male arrival at the breeding ground. However, this study did not take sperm competition pathways of sexual selection into account. Here, we tested the effects of male ornamentation, arrival time, and sperm morphology on males' ability to sire offspring and gain fitness.

We studied these questions in the collared flycatcher (Ficedula albicollis), a migratory bird in which males arrive on the breeding grounds before females. Males of this species display two white patches, one on the forehead and the other on the wing, that have been found to be sexually selected in a Swedish population (e.g., Sheldon and Ellegren 1999; de Heij et al. 2011). However, there may be differences in the strength of sexual selection between populations. For example, a large forehead patch has been found to be preferred in extra-pair mates in the Swedish (Sheldon et al. 1997; Sheldon and Ellegren 1999) but not in the Hungarian (Rosivall et al. 2009) or Czech (Edme et al. 2016) populations. This calls for replicative research both within and between populations to test whether the differences between studies really represent differences in the strength of selection between populations, which would have important consequences for the evolution of the species (see Scordato and Safran 2014), or whether they are merely caused by sampling variance.

Methods

Study site and species

This study was carried out in an oak forest with approximately 350 nest boxes that are distributed among five study plots in Velky Kosir (49°32′N, 17°04′E) in Moravia, Czech Republic, from 2013 to 2015. Collared flycatchers (F. albicollis) are migratory passerine birds, and males arrive first (around mid-April) at the breeding site to obtain territories. Males are black with one white patch on the forehead and another on their wing feathers. Females selected their social mates based on both white ornaments in a Swedish population (Sheldon and Ellegren 1999; de Heij et al. 2011) and usually lay between four and eight eggs after pair bond formation. Chicks can hatch asynchronously as females start to incubate before the completion of the clutch. Both parents feed the chicks.

Adult measurements and forehead patch manipulation

In 2013, the first male arrived on April 15, and we started to trap males the following day. In total, we trapped males on 12 different days between April 16 and May 15. Each trapping day, we captured males in all empty or abandoned nest boxes with string nest box traps. We did not activate traps in nest boxes where nest material appeared unless these were apparently abandoned for several days (i.e., no progress in nest building). For individual males, we considered the first day of capture as their date of arrival. Our trapping scheme was highly efficient as the first day of capture was highly correlated (r = 0.96) with true arrival date as inferred from 16 males bearing geolocators in 2015 (M. Briedis et al., unpublished data). Immediately after each new male was captured, it was brought to the central site located among study plots. This transfer lasted up to half an hour.

At the central site, body mass, wing length, and tarsal length were measured. The wing patches were determined by summing the visible length of white patches on primaries 3 to 8 from the tip of the coverts to the distal part of the wing (in mm). All of these measurements were done by one person (MK). A blood sample was taken from the tarsal vein and stored in alcohol. A cloacal massage allowed us to obtain a sperm sample (see Quay 1986), which was stored in 4% formaldehyde. The age of males was determined by wing plumage as subadult males have brownish primaries. The forehead patch area was photographed two times before and another two times after the manipulation. The original patch size was computed as the mean of the two measurements before the manipulation delimited to the nearest 0.1 mm² in ImageJ software.

We regularly rotated among three treatments: (1) We increased the height of white forehead patches by painting black feathers with a white marker (Alteco Paint Marker no.15). Using this technique, the size of the white patch was enlarged by ca. 50% (Table 1, Supplementary Online Material). We decided to use Alteco markers instead of Tippex used in former studies (e.g., Qvarnström et al. 2000) since they proved to be more durable during our pre-experimental manipulation done on caged zebra finches. Tippex usually started to erode within a week of manipulation, while Alteco still looked good after 7 days. Both Alteco and Tippex have similarly shaped reflectance curves that differ from natural white feathers. At low and high wavelengths, natural white reflects more than Tippex and especially Alteco (see Fig. 1). (2) Control birds were only measured and then released without any manipulation of the forehead patch. (3) We decreased the height of the white forehead patch to about half (Table 1, Supplementary Online Material) by painting it with a Copic 110 special black marker that has previously been used in flycatchers (de Heij et al. 2011). This manipulation resulted in naturally low reflectance (Fig. 1) but started to fade within a few days of manipulation. Our rotation scheme led to a random distribution of treatments among plots as indicated by a non-significant relationship between plot and frequency of treatment (χ² = 9.23, p = 0.324, df = 8, n = 73).
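
The reported test of treatment against plot can be checked roughly from the published statistic alone. The following is a minimal sketch, assuming a standard chi-square test of independence on a 3 treatments × 5 plots table; the function name is illustrative and the underlying counts are not given in the text. It relies on the closed-form upper tail of the chi-square distribution, which exists for an even number of degrees of freedom.

```python
import math

def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability P(X >= x) for a chi-square variable with even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("the closed form used here requires a positive even df")
    half = x / 2.0
    # For df = 2k: sf(x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    terms = (half**i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * sum(terms)

# 3 treatments x 5 plots gives (3 - 1) * (5 - 1) = 8 degrees of freedom.
df = (3 - 1) * (5 - 1)
p_value = chi2_sf_even_df(9.23, df)
print(f"df = {df}, p = {p_value:.3f}")
```

The result should land close to the reported p = 0.324; small differences are expected because the χ² value is rounded to two decimals in the text.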

Table 1 Summary of means ± SD for different traits according to patch treatment
Fig. 1 Reflectance of primaries of adult males before and after coloration with black or white markers. Five measurements were taken from the feathers of two males, and the lines are averages of these five measurements. The reflectance of primaries of adult males likely closely reflects that of their foreheads, but the former was easier to measure on dead birds that were available before the breeding season. These dead birds were killed by great tits that destroyed their foreheads.

Because our manipulations were relatively short-term, they could mainly affect processes at the start of the breeding season like female choice of social partners, which usually takes place during the days after arrival to breeding sites (8 days for control males on average; see Table 1, Fig. 2). However, they might be less effective for female choice of extra-pair partners which might continue for a long time after males are socially mated, although most extra-pair copulations likely take place early in the female fertile period (Krist et al. 2005; Krist and Munclinger 2011) which peaks 2 days before laying of the first egg (Lifjeld et al. 1997). In the nests attended by our control males, laying started 6 days after social mating, i.e., 14 days after male arrival.

Fig. 2 Relationship between arrival date and mating speed for the three treatments of forehead patch size. Control treatment, solid circles and solid line; decreased treatment, open circles and dotted line; enlarged treatment, triangles and dashed line.

After manipulation, males were released on the plot where they had been caught. We caught the males a second time during the feeding period, and the same measurements were taken as well as blood and sperm samples. Females were also caught during the feeding period and were measured in the same way as males except for the forehead patch.

Monitoring of reproductive success

Nests were checked daily when the first egg was expected after nest building. Each egg was marked to obtain the laying order. The width and length were measured with digital calipers (±0.01 mm). The volume of the egg was calculated as volume = 0.51 × length × width² (Hoyt 1979). When females ended the laying sequence and began incubation, we stopped the daily checks and started once again when the hatchlings were expected (around 10 days after the last egg was laid). A blood sample was obtained from chicks 6 days after hatching, and their fate was monitored until fledging. Unhatched eggs were collected 4 days after the last chicks hatched, and embryos were stored in ethanol, as were all of the other chicks found dead before day 6. Blood and tissue samples were used for paternity analyses.
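
The egg volume formula can be illustrated with a short calculation. This is a minimal sketch: the function name and the egg dimensions are hypothetical and are not measurements from the study; only the coefficient 0.51 and the form of the equation come from the text.

```python
def egg_volume_mm3(length_mm: float, width_mm: float) -> float:
    """Hoyt (1979) approximation: volume = 0.51 * length * width^2."""
    return 0.51 * length_mm * width_mm**2

# Hypothetical (length, width) pairs in mm for one clutch, for illustration only.
example_eggs = [(17.6, 13.4), (17.9, 13.3), (17.2, 13.5)]
volumes = [egg_volume_mm3(length, width) for length, width in example_eggs]

# Mean egg volume per clutch, the quantity used as a response variable.
mean_volume = sum(volumes) / len(volumes)
print(f"mean egg volume = {mean_volume:.0f} mm^3")
```

Because width enters as a square, a measurement error in width affects the estimated volume roughly twice as strongly as the same relative error in length.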

In 2014 and 2015, we captured all of the males at arrival and both sexes during the breeding season, so we were able to count the number of recruits as all the chicks were ringed during the field season in 2013. We did not record whether those recruits bred during those 2 years, but only their survival since fledging. So our recruitment data concerned the number of chicks who survived and were able to come back to our field area. Despite natal fidelity being relatively high in our study area (Krist 2009), some individuals surely dispersed and thus our estimate of recruitment represents only the lower limit of the real value.

Genotyping and parentage assignment

DNA extraction was performed with DNeasy® Blood & Tissue Kit (Qiagen) for blood samples and tissue from dead embryos and chicks. All of the samples were genotyped at eight polymorphic microsatellite autosomal loci: Fhu2 (or PTC3) (Ellegren 1992), Cuμ04 (Gibbs et al. 1999), Fhy310, Fhy405, Fhy407, Fhy428, Fhy431, and Fhy452 (Leder et al. 2008). The microsatellites were amplified in a single multiplex PCR using fluorescently labeled primers and a Type-it® Microsatellite PCR Kit (Qiagen). The reaction conditions were as follows: 5 min at 95 °C, then 30 cycles of 30 s at 95 °C, 90 s at 65 °C, and 30 s at 72 °C, and finally 30 min at 60 °C. PCR products were mixed with GeneScan™-500 LIZ® Size Standard (Applied Biosystems) and analyzed with an ABI PRISM® 3100 Genetic Analyzer (Applied Biosystems). GeneMarker® version 1.9 was used to score the genotypes, and locus characteristics based on allele frequencies were obtained with Cervus 3.0.3 (Kalinowski et al. 2007).

We obtained the genotypes of 262 adults (104 females and 158 males). For the first parent, the combined non-exclusion probability for that group of loci was found to be 7.03 × 10⁻⁴. We only considered the individuals that were genotyped at five loci or more for parentage analysis. First, when the female genotype was known, we compared it with the genotypes of her chicks to check for egg dumping. One chick did not correspond to its social mother and was excluded. Second, when the social male was known, we compared the genotype of the male with those of the chicks he fed. If trio confidence (female-social male-chicks) based on Delta (difference in overall likelihood ratio scores between the most likely candidate parent and the second most likely candidate parent) and simulations of parentage exceeded 95%, we considered the chicks to be within-pair young. In cases where the mother was unknown, we took into account the duo confidence (male-chicks) with the same criterion. All chicks that were not assigned as within-pair young were classified as extra-pair young. Finally, we tried to determine the males who sired the extra-pair young. We selected all the males from the breeding season and compared their genotypes with those of the extra-pair chicks using the same criterion of 95% trio or duo confidence.

Sperm analyses

Two hundred and forty-two sperm samples were stored in 4% formaldehyde (152 from males at arrival and 90 during the feeding period) either at room temperature or at 8 °C in a refrigerator. We created slides for microscopy by spreading 7 μl of a sperm sample and letting it dry. The slide was then carefully rinsed with distilled water in order to remove dirt and salt crusts and air-dried again. For each sample, 20 pictures of morphologically normal-looking sperm were taken at ×400 magnification under light-field conditions using an Olympus CX41 microscope equipped with an Infinity 2 camera. If 20 sperm were not found on the first slide, a second slide was prepared. If no sperm at all were found after those two slides, we did not prepare a third slide. For samples in which between 1 and 19 sperm were found after two slides, a third and final slide was analyzed to complete the number of sperm pictures. We obtained 130 samples with the required number of sperm at arrival and 39 at feeding. Heads, mid-pieces, and tails were measured (μm) in ImageJ software 1.49v (see Laskemoen et al. 2010). All of these measurements were done blindly by one person (PZ). Total sperm length was calculated by adding the three parts. Mean sperm length was calculated for each male.
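
The per-male sperm length summary described above can be sketched as follows. This is an illustrative sketch only: the function names and the measurements are hypothetical, and the rule of returning no value when fewer than 20 cells are available is our reading of "the required number of sperm".

```python
from statistics import mean

def total_sperm_length(head_um: float, midpiece_um: float, tail_um: float) -> float:
    """Total length is the sum of the three measured parts (in micrometres)."""
    return head_um + midpiece_um + tail_um

def male_mean_sperm_length(cells, required=20):
    """Mean total length over a male's measured cells, or None if too few cells."""
    if len(cells) < required:
        return None
    return mean(total_sperm_length(*cell) for cell in cells)

# Hypothetical (head, mid-piece, tail) measurements for 20 cells of one male.
cells = [(13.0 + 0.1 * i, 68.0, 15.0) for i in range(20)]
print(male_mean_sperm_length(cells))      # mean total length for this male
print(male_mean_sperm_length(cells[:5]))  # None: not enough cells measured
```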

Statistical analyses

All statistical analyses were conducted in RStudio, version 0.99.878 (R Core Team 2014), and we used the “lm” or “glm” function from the package “stats” (R Core Team 2014). Since males were trapped at arrival (n = 153) and recaptured during the feeding period (n = 73), it was possible to identify those who were successful at pairing and establishing a nest in that particular season. To test this, we fitted a generalized linear model with a binomial error distribution (glm function from the stats package in R). The response variable was mating success (obtaining a nest: yes/no), and the predictors were the arrival date, original forehead patch size, wing patch size, the relative age (adult/subadult), and treatment (enlarged, decreased, and control). We also tested the interaction between arrival date and treatment, as the effect of ornament manipulation was found to be dependent on the time of the season in a previous study (Qvarnström et al. 2000). When this interaction was non-significant, it was removed from the final model. Continuous predictors in our models (i.e., male arrival and size of original forehead patch, wing patch, and sperm length) were not strongly intercorrelated (all r between −0.4 and +0.4, n = 63), indicating that multicollinearity was not a serious problem.

Another factor that we were interested in was mating speed. We calculated this as the time between male arrival date and the start of nest building by its social female. Six out of 73 males presented a negative value for the time lapse between those two dates, indicating that we trapped them well after their arrival. These males were trapped while searching for secondary nest sites after they had started to breed in their primary nest box. We excluded them from all analyses. Five out of 73 breeding males were polygynous, and their secondary nests were not considered in analyses of mating speed, clutch size, and egg size. So in total, 67 manipulated and breeding males were used for most of the analyses. The mating speed ranged from 0 to 37 days (see also Table 1). A linear model was run, where the response variable was mating speed and the predictors were the same as in the model for mating success.
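
The calculation of mating speed and the exclusion of males with negative values can be sketched as below. The male identifiers and dates are hypothetical and serve only to show the rule; they are not records from the study.

```python
from datetime import date

def mating_speed_days(arrival: date, nest_building_start: date) -> int:
    """Days between male arrival (first capture) and start of nest building."""
    return (nest_building_start - arrival).days

# Hypothetical records: male id -> (first capture date, start of nest building).
records = {
    "male_A": (date(2013, 4, 18), date(2013, 4, 27)),
    "male_B": (date(2013, 4, 25), date(2013, 4, 25)),
    "male_C": (date(2013, 5, 10), date(2013, 5, 2)),  # caught after nest building began
}

speeds = {male: mating_speed_days(a, n) for male, (a, n) in records.items()}

# A negative value means the male was trapped well after his true arrival,
# so he is dropped, as described in the text.
retained = {male: days for male, days in speeds.items() if days >= 0}
print(retained)
```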

We also tested whether females changed their early reproductive effort with respect to male secondary sexual traits, as is predicted by the theory of reproductive allocation (Sheldon 2000; Horváthová et al. 2012). In the first model, we looked at the number of eggs the female laid. A linear model was run on the clutch size, as it had a better fit than the alternative Poisson model, and residuals from the linear model were normally distributed. The response variable was clutch size, and the predictors were the same as in the model for mating success. Second, we looked at the volume of the eggs; a linear model was run with mean egg volume as the response variable and the same predictors as in the model for mating success.

We added sperm length among predictors of the models testing for paternity success. We used the male sperm length measured at arrival. For five males, we obtained sperm only for the feeding period. As we had the mean size at arrival and feeding for 28 males, we calculated the difference between the mean sperm size at arrival (mean ± SD; 96.7 ± 3.20 μm) and the mean size at feeding (98.2 ± 2.30 μm) and subtracted this difference from the size at feeding for those five males without arrival data. In this way, we extrapolated the size of the sperm at arrival for those males. The results would be very similar if these males were excluded from the analyses. For another five males, we did not obtain enough sperm either at arrival or during breeding, and therefore, we excluded them from this analysis that was consequently based on 62 males.
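
The extrapolation of arrival sperm length for the five males sampled only at feeding amounts to a constant shift. A minimal sketch, assuming the shift is the feeding mean minus the arrival mean reported above (about 1.5 μm); the function name and the example value are hypothetical.

```python
MEAN_AT_ARRIVAL_UM = 96.7  # mean sperm length at arrival (from the text)
MEAN_AT_FEEDING_UM = 98.2  # mean sperm length at feeding (from the text)

# Average increase between arrival and feeding among the males sampled twice.
offset_um = MEAN_AT_FEEDING_UM - MEAN_AT_ARRIVAL_UM

def estimate_arrival_length(feeding_length_um: float) -> float:
    """Shift a feeding-period measurement back to the arrival scale."""
    return feeding_length_um - offset_um

# Hypothetical male measured only during the feeding period.
print(f"{estimate_arrival_length(99.0):.1f} um")
```

This treats the seasonal change as the same for every male, which is a simplification; the text notes that results were very similar when these five males were excluded.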

The total paternity success of a male can be separated into two parts: the within-pair paternity in the social nest and the extra-pair paternity obtained in other nests. We first looked at the within-pair paternity with a generalized linear model with a quasi-binomial distribution and event/trial syntax for the response variable. Accordingly, the response variable was the number of within-pair young (event) out of the clutch size (trial). In addition to predictors used in the model for mating success, we added mean sperm length and its quadratic term to all three models testing for paternity success. We added the quadratic term to the models to test for the possibility of stabilizing selection on sperm size (Lifjeld et al. 2010). The extra-pair paternity was analyzed by a generalized linear model with a quasi-Poisson distribution. The response was the number of extra-pair young that males sired in all other nests in the nest box population (n = 119 nests with genotyped offspring). The predictors were the same as in the preceding model. As in all other models except that for mating success, we tested only the success of males breeding in our nest boxes. For five identified polygynous males, we added the number of young they sired in their secondary nests to their extra-pair success. This was done to be equivalent to cases where polygynous males were not identified at all as they did not feed their secondary nests. By this method, extra-pair success was overestimated while within-pair success was underestimated for polygynous males. Nevertheless, the results would be very similar if the five identified polygynous males were excluded from this model (results not shown). Moreover, this slight inadequacy did not affect the model of male total paternity because in this model the two paternity components were summed together. The model for total paternity was the same as for extra-pair paternity except for the response variable, which was the total paternity. Our estimates of male extra-pair and therefore also total paternity success only reflect the lower limits of the real values since focal males might also sire offspring in natural cavities, i.e., outside our genotyped nest box population.
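
To make the event/trial response and the "quasi" families more concrete, the sketch below computes, for an intercept-only case, the pooled proportion of within-pair young and the Pearson dispersion estimate that quasi-binomial models use to scale standard errors. It is not the full model with predictors described above; the counts and function names are hypothetical.

```python
def pooled_proportion(events, trials):
    """Maximum-likelihood estimate of p in an intercept-only binomial model."""
    return sum(events) / sum(trials)

def pearson_dispersion(events, trials, n_params=1):
    """Pearson chi-square divided by residual degrees of freedom."""
    p = pooled_proportion(events, trials)
    chi2 = sum((y - n * p) ** 2 / (n * p * (1 - p)) for y, n in zip(events, trials))
    return chi2 / (len(events) - n_params)

# Hypothetical nests: within-pair young (event) out of clutch size (trial),
# plus young sired by the same male in other nests.
within_pair = [6, 5, 3, 7, 4, 6]
clutch_size = [6, 6, 6, 7, 5, 6]
extra_pair_elsewhere = [0, 2, 0, 1, 0, 0]

p_within = pooled_proportion(within_pair, clutch_size)
phi = pearson_dispersion(within_pair, clutch_size)

# Total paternity is the sum of the two components for each male.
total_paternity = [w + e for w, e in zip(within_pair, extra_pair_elsewhere)]
print(f"p = {p_within:.3f}, dispersion = {phi:.2f}, totals = {total_paternity}")
```

A dispersion estimate well above 1 indicates more variation among nests than a plain binomial model allows, which is the usual motivation for choosing a quasi-binomial or quasi-Poisson family.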

The number of fledglings and recruits is reflective of male fitness, so we ran two other models with the number of male genetic offspring that fledged as a response in the first model and the number of genetic offspring that were recruited (in 2014–2015) in the second model. For both models, the predictors were the same as in the model for mating success.

Results

During the arrival period, 160 males were trapped and 153 were involved in the patch manipulation experiment (52 enlarged, 51 decreased, and 50 control). Seventy-three of them were recaptured when they were feeding chicks. Five of them were polygynous. We excluded secondary nests of polygynous males from analyses of mating speed, clutch size, and egg size. We also excluded six males that were caught a long time after their arrival (see “Methods” section). Consequently, our sample size for most analyses was 67 breeding males. In all models testing for paternity success, our sample size was reduced to 62 males due to missing sperm samples from 5 males. In these 62 nests, 286 young were sired by the social mates and 67 by extra-pair mates. These 62 social males also sired 93 offspring outside their primary nests.
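
The paternity counts above imply a few simple derived quantities, worked through below. Only the three counts come from the text; the variable names and the derived values are our own arithmetic.

```python
within_pair_young = 286           # sired by social males in the 62 focal nests
extra_pair_young_in_nests = 67    # sired by other males in the same nests
sired_outside_primary_nest = 93   # sired by the 62 social males elsewhere

# Share of genotyped young in the focal nests that were extra-pair.
young_in_focal_nests = within_pair_young + extra_pair_young_in_nests
extra_pair_share = extra_pair_young_in_nests / young_in_focal_nests

# Total offspring attributed to the 62 social males (the total paternity response).
total_sired = within_pair_young + sired_outside_primary_nest

print(f"extra-pair share = {extra_pair_share:.1%}")
print(f"total sired = {total_sired}, per male = {total_sired / 62:.2f}")
```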

The males involved in our treatment arrived on average on Julian day 112.9 ± 6.0 (mean ± SD; April 23) and required 9.7 ± 6.6 days to pair (see Table 1 for more details). Females laid on average 6.06 ± 0.95 (mean ± SD) eggs, and the mean volume of the eggs was 1623 ± 130 mm³ (Table 1), with an average of 4.91 ± 2.60 chicks fledging from each nest (Table 1). We recaptured 83 of the nestlings in 2014 and 2015. The mean ± SD number of recruits per nest was 1.24 ± 1.26 (Table 1).
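
The conversion from Julian day (day of the year) to a calendar date can be reproduced as follows. This is a small sketch assuming the day count starts at 1 on January 1, 2013; the function name is illustrative.

```python
from datetime import date, timedelta

def day_of_year_to_date(year: int, day_of_year: int) -> date:
    """Convert a 1-based day of the year to a calendar date."""
    return date(year, 1, 1) + timedelta(days=day_of_year - 1)

# Mean arrival day reported in the text, rounded to a whole day.
mean_arrival = day_of_year_to_date(2013, round(112.9))

# Mean arrival plus mean pairing time gives an approximate mean pairing date.
mean_pairing = day_of_year_to_date(2013, round(112.9 + 9.7))

print(mean_arrival.isoformat(), mean_pairing.isoformat())
```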

None of our main variables (arrival date, original size of male ornaments, and their experimental treatments) significantly affected male mating success (Table 2), although males with enlarged patches (21/52 = 40.4%) had a non-significantly lower mating success compared to the control group (25/50 = 50%) and males with reduced patches (27/51 = 52.9%). Similarly, males in the enlarged treatment took non-significantly longer to pair than males in the other two treatments (Tables 1 and 2), and this seemed to be true mainly late in the season (Fig. 2), although the interaction between treatment and arrival date was marginally non-significant (p = 0.10). We did not find any evidence for female pre-hatching differential allocation since neither egg size nor clutch size differed between treatments (Tables 1 and 3). Male success in sperm competition was not affected by their arrival date, size of original forehead patch, experimental treatment, or sperm size (Table 4, Fig. 3). Finally, we also did not find a significant effect of any predictor on male fitness as determined by the number of fledglings and recruits, although males in the enlarged treatment had somewhat poorer performance compared to those in reduced and especially control treatments (Tables 1 and 5).

Table 2 Models for mating success (N = 153) and the speed of mating (N = 66)
Table 3 Models for clutch size (N = 67) and egg volume (N = 66)
Table 4 Models for within-pair, extra-pair, and total paternity (N = 62)
Fig. 3 Relationship between sperm size and total number of sired offspring (total paternity). Solid circles depict males that had only one social nest (n = 57). Open circles depict polygynous males (n = 5). Fitted line shows predicted quadratic regression.

Table 5 Models for number of fledglings and recruits (N = 67)

Discussion

We found several lines of evidence suggesting that males in the enlarged treatment of forehead patch size might have inferior breeding performance compared to males in control and reduced treatments. They had lower mating success, it took them longer to pair, especially late in the season, and their fitness as measured by clutch size and number of fledglings and recruits was also lower than in the other two treatment groups. However, although these effects were visible in the difference between means (Table 1), they were also highly variable, which caused them to be statistically non-significant, despite the fact that we involved the whole nest box population in our experiment and thus had a sample size comparable to many previous studies.

The manipulation of male attractiveness is a common type of experiment when studying mate choice, female investment, and paternity (Mazuc et al. 2003; Grana et al. 2012; Horváthová et al. 2012). Manipulations of ornaments in collared flycatchers were previously done in the isolated Swedish population on the island of Gotland (Qvarnström 1999; Qvarnström et al. 2000; de Heij et al. 2011). Here, we partly replicated the forehead patch size manipulation from the Qvarnström et al. (2000) study in a Czech population of collared flycatchers. Qvarnström et al. (2000) found that females only preferred males with enlarged patches late in the season while having no strong preferences early in the season. We did not find a statistically significant interaction between arrival date and experimental treatment of the forehead patch in their effect on pairing latency. If anything, there was an opposite tendency. Females in our population did not show any preferences early in the season but tended to prefer control males and males with decreased patch sizes over enlarged ones later in the season. There are several potential explanations for these different results.

First, it may be due to the type of white markers used in the experiment. We used a white paint marker (Alteco) while Qvarnström et al. (2000) used Tippex. However, this difference is unlikely to explain the opposite direction of our results as the shape of the reflectance curves of the two markers is very similar. In contrast, the shape of the reflectance curve of natural white is different from both artificial colorations (see Fig. 1). Consequently, it is possible that females can distinguish between natural and artificial white and consider only the natural one as attractive while the artificial one may be unattractive. If true, the different results could partially stem from a difference in the treatment of control groups. In our study, we did not color the control group at all, contrary to Qvarnström et al. (2000), who painted Tippex over the natural white to the same extent as was used to paint the enlarged patch group over their natural black. Consequently, females in our study might have perceived enlarged patch males as less attractive because they had the same extent of natural (and attractive) white as control males but, in addition, they had patches of artificial white that made them unattractive. In contrast, in the study of Qvarnström et al. (2000), both the enlarged patch and control groups had the same extent of artificial white, but the experimental group retained a larger extent of natural white, making them more attractive.

Second, Qvarnström et al. (2000) kept the males caged for 1 day to break their dominance over their original territories. We released males immediately after patch size manipulation and thus allowed them to return to their territories without a need to fight for them once more. If nest-site competition was intense only late in the season due to the lack of unoccupied territories, then the pairing latency of large-patched males could be shorter only at the end of the season, as was found by Qvarnström et al. (2000), due to their ability to win the competition over territory (see Pärt and Qvarnström 1997). In contrast, pairing latency in our population should not be as strongly affected by male-male competition and should thus more directly represent female mate choice.

Finally, and most interestingly, differences in the role of ornaments in sexual selection may exist between populations (Scordato and Safran 2014). For example, it has been shown that forehead patch size is condition dependent (Gustafsson et al. 1995) and males with large forehead patches are preferred as social (Qvarnström et al. 2000) and extra-pair (Sheldon et al. 1997) partners in an isolated Swedish population. In contrast, wing patch size (Török et al. 2003) but not forehead patch size (Hegyi et al. 2002) is a condition-dependent signal important in male-male competition (Garamszegi et al. 2006) in a Hungarian population. Similarly to the Hungarian population, wing but not forehead patches played a role in extra-pair paternity in our Czech population (Edme et al. 2016). These similarities suggest a greater role of wing patches in Central Europe, the core of the distribution of the collared flycatcher. Nevertheless, females apparently paid attention to male foreheads in our population too, as they were less willing to mate with males with enlarged patches, and this was true especially late in the season. This change of mate preference with the season suggests an underlying change in costs and benefits of mating with large-patched males (Qvarnström 2001).

One explanation for plastic mate preferences may be the greater dependence of chicks on male paternal care late in the season. Consequently, females may be reluctant to pair with males that will not provide enough parental care during this difficult period of the breeding season. Highly ornamented males may invest resources into mating effort and provide less paternal care (Qvarnström 1997; Mazuc et al. 2003; Mitchell et al. 2007). Moreover, the size of the forehead patch may be used by females as an indicator of paternal care as this patch has been shown to decrease in the year following experimental increase of brood size (Gustafsson et al. 1995). Females living in populations with very limited resources may prefer males with smaller secondary sexual ornaments throughout the year (Griffith et al. 1999).

On the other hand, avoiding dominant males may also mean a loss on the side of indirect benefits if these males are genetically superior over subordinates. Therefore, females socially mated to high-quality fathers may increase the genetic component of offspring fitness by extra-pair copulation with superior males (Jennions and Petrie 2000). Extra-pair paternity is common in the collared flycatcher and is often related to secondary sexual plumage traits (Sheldon and Ellegren 1999; de Heij et al. 2011; Edme et al. 2016) as is also common in other species (Jennions and Petrie 2000; Akçay and Roughgarden 2007). However, extra-pair paternity is not determined solely by behavioral interactions among females and social and extra-pair males but also by the ability of sperm to fertilize ova, a process known as sperm competition. This area of research has been studied only recently and has yielded mixed results. Some studies have found a relationship between sperm traits and success in extra-pair paternity (Laskemoen et al. 2010; Bennison et al. 2015) while others have not supported this idea (Cramer et al. 2013).

Here, we found neither a linear nor a non-linear effect of sperm size on within-pair or extra-pair paternity. Thus, there was no evidence of either directional or stabilizing selection on sperm size. Stabilizing selection for optimal sperm size is hypothesized to be linked to the intensity of sperm competition between species, with the strongest selection for optimal sperm phenotype in the most promiscuous species (Lifjeld et al. 2010). Sperm competition in our population is quite intense as roughly 20–25% of young are sired by extra-pair males (Krist et al. 2005; Krist and Munclinger 2011; this study). Therefore, at first sight, our results do not seem to support the hypothesis of Lifjeld et al. (2010). However, it is tremendously difficult to predict within-species effects from comparative studies. It could be that our population has already reached evolutionary equilibrium, in which the sperm size of all males is so close to the species’ optimum that any subtle differences in sperm morphology play no role in their fertilizing abilities. Moreover, other sperm traits that we did not measure might be more relevant for success in sperm competition, for instance sperm viability (Smith 2012), speed of swimming (Birkhead et al. 1999), and number of sperm cells in the ejaculate (Laskemoen et al. 2010).

We partially replicated the study of Qvarnström et al. (2000) that manipulated forehead patch size in the collared flycatcher. Contrary to the Swedish population, we did not find any evidence for female preference of males with enlarged patches late in the season. Males with artificially enlarged patches seemed to be unattractive in the Czech population, and this was especially true late in the season. We also did not find any evidence that sperm size affects within-pair or extra-pair paternity and consequently male fitness. These findings call for replicated research both in well-established fields like female mate choice with respect to male ornaments and emerging ones like sperm variation and its effect on paternity and fitness.