Introduction

Female ornamentation has long been interpreted as a by-product of a genetic correlation with a trait that is selected in males but has no function in females (Lande 1980). Since studies demonstrated that female ornaments can have a signalling function (Jones and Hunter 1993; Potti and Merino 1996; Amundsen et al. 1997; Amundsen and Forsgren 2001), it has been acknowledged that natural and/or sexual selection also acts directly on ornaments in females and can explain their development (Amundsen 2000; Clutton-Brock 2009). Selection processes for ornamentation in both males and females are mate choice and mate competition, competition over resources other than mates and selection for sexual mimicry (Kraaijeveld et al. 2007). Most studies on the signalling function of female ornaments have been carried out in species where the female ornament is a reduced form of a sexually selected trait in males. In these species, ornamentation could be linked to female quality and fecundity, although results are mixed, with some studies reporting no or negative relationships (reviewed in Amundsen 2000; Kraaijeveld et al. 2007; Clutton-Brock 2009).

There are substantially fewer studies on species with female-specific ornaments. In a number of non-avian species where females display ornaments that are not present in males, it has been demonstrated that these female-specific ornaments can have a signalling function as well (Funk and Tallamy 2000; Amundsen and Forsgren 2001; Domb and Pagel 2001; Weiss 2006; Weiss et al. 2009). Although the literature on female ornaments is more extensive for birds than for other taxa, most of the empirical work in birds has likewise been carried out in species with similar ornaments in males and females (Amundsen 2000; Bennett and Owens 2002), perhaps because the situation where both males and females are ornamented but have different ornaments appears to be rare. Heinsohn et al. (2005), for example, explored the reasons for the strikingly different plumage colouration of males and females in the parrot Eclectus roratus and found different selection pressures acting on males and females, attributed to sex-based differences in their lifestyle. However, their study does not link any of the plumage characters to measures of individual quality, and to our knowledge, no study has so far investigated the potential of female-specific colouration as a quality signal in birds with conventional sex roles.

We studied female-specific colouration in a strongly sexually dimorphic species, the upland goose Chloephaga picta leucoptera. Upland geese are highly territorial and socially monogamous, usually returning to the same territory with the same mate every year (Summers and McAdam 1993). They belong to the tribe of the sheldgeese (Tadornini), a group that resembles true geese and shows similar habits but is more closely related to shelducks and ducks. Male upland geese have white head and breast feathers and black legs, whereas females have reddish-brown head and breast feathers and conspicuous yellow-orange legs (Fig. S1). Several studies on a variety of species have provided evidence that integument colouration (e.g. bill, skin) reliably signals individual quality or state (e.g. Negro et al. 1998; Faivre et al. 2003; Velando et al. 2005, 2006; Martinez-Padilla et al. 2007; Mougeot et al. 2007b, 2009). In contrast to plumage colouration, which may mostly reflect the conditions during moult and not necessarily condition at the time the study is conducted, the colour of skin or other integumentary parts reflects a more recent physiological state and hence is a more plastic indicator of current condition (Lozano 1994; Negro et al. 1998; Bortolotti et al. 2003; Martinez-Padilla et al. 2007). However, integument colouration may change rapidly (Faivre et al. 2003; Rosen and Tarvin 2006; Velando et al. 2006), and reliability can only be sustained if these changes reflect current state or the change of the state from one year to the next and if the relative state of an individual compared to others is stable (Greenfield and Rodriguez 2004; Senar and Quesada 2006; Perez-Rodriguez 2008). As Perez-Rodriguez (2008) showed for captive red-legged partridges Alectoris rufa, carotenoid-based integument colouration can be consistent between individuals both within and between years.

In this study, we examined whether the expression of female-specific colouration can reliably predict female phenotypic quality in the upland goose. Our study is thus the first to test whether female-specific colouration varies in expression and whether this variation has signal value in a bird species with conventional sex roles. Furthermore, no study has so far investigated the stability of integument colouration in consecutive years for wild birds under natural conditions. Our specific aims were as follows. We tested:

  1.

    Whether female colouration was positively related to her reproductive investment, in terms of clutch and egg size. Egg size is an important component of parental effort in birds and several studies have shown a positive relationship between egg size and offspring fitness, in altricial as well as precocial birds. Larger eggs may enhance offspring fitness by increasing survival prospects in the first days after hatching and competitive power through a larger offspring size and the possession of more resources to survive adverse conditions (Ankney 1980; Amundsen and Stokland 1990; Magrath 1992; Potti 1999; Dawson and Clark 2000; Anderson and Alisauskas 2002; Christians 2002; Goth and Evans 2004; Rutkowska and Cichon 2005). In our system, we found that chicks hatched from nests with higher mean egg volume had higher early body condition (Gladbach et al. 2010).

  2.

    Whether female colour was repeatable across years, an important factor in the reliability of a variable signal like integument colouration, which has never been investigated in the wild. Measures of individual quality need to be relatively consistent and repeatable over time in order to reflect the state of an individual over long time spans and reveal information about the inherent quality of individuals. To test this in the upland goose, we sampled individuals over two consecutive breeding seasons.

  3.

    Whether variation in tarsus colour correlates with variation in plasma carotenoid levels both among and within individuals, indicating if changes in integument colouration may reflect individual differences and changes in carotenoid levels. In many species, the yellow, orange or red colour of integumentary parts results from the presence of carotenoids (McGraw 2004; Mougeot et al. 2007a; Juola et al. 2008; Perez-Rodriguez 2008). Carotenoids are natural pigments that cannot be synthesised by vertebrates and hence must be obtained via the diet (Fox 1979; Brush 1981; McGraw 2005). Therefore, the expression of carotenoid-based ornaments can indicate good foraging efficiency and condition (Endler 1983; Hill et al. 1994; Moller et al. 2000) or a high carotenoid availability in good territories. Carotenoids also have a range of health-related functions: they are known to work as antioxidants and immune enhancers (Lozano 1994; Olson and Owens 1998). This provides the basis for the honesty of carotenoid-derived signals because individuals may have to trade off the allocation of carotenoids to self-maintenance against ornamental colouration (Lozano 1994; von Schantz et al. 1996; Blas et al. 2006).

Materials and methods

Study site and study species

The study was carried out in the New Island Nature Reserve, Falkland Islands (51°43′S, 61°17′W) from October to December 2007 and 2008. The island was established as a nature reserve in 1970, when all livestock was removed from the island. This led to an increase in the density of upland geese, which is now one of the highest in the Falkland Islands (Quillfeldt et al. 2005). The basic breeding biology and life cycle of upland geese were studied in the Falklands from 1977 to 1980 (Summers 1983a). Take up of territories starts in August, egg laying commences in late September and most clutches are finished by the end of October. Hatching mainly takes place between mid-October and mid-November, and fledging starts when offspring are about 70 days old.

Field measurements and sampling

At the start of each field season, we mapped nests using a Global Positioning System receiver. For each nest, we determined clutch size, measured length (L, expressed in centimetres) and breadth (B, expressed in centimetres) of each egg to the nearest 0.1 mm using callipers and weighed each egg to the nearest 0.1 g using a digital balance. Egg volume (V, in cubic centimetres) was calculated as V = L × B² × 0.507 following Furness and Furness (1981). We defined total clutch volume as the sum of the volumes of each egg in the clutch. We estimated expected hatching dates as described in Gladbach et al. (2010). We visited nests at least once a day, starting at the estimated hatching date; all eggs hatched within 0–2 days from the estimated hatch date.
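The egg volume formula and the definition of total clutch volume can be written out as a short calculation. The following Python sketch is only an illustration: the function names and the example measurements are hypothetical and are not taken from the study.

```python
from typing import List, Tuple

# Shape coefficient from the formula quoted in the text (Furness and Furness 1981)
SHAPE_COEFFICIENT = 0.507


def egg_volume(length_cm: float, breadth_cm: float) -> float:
    """Egg volume in cm^3 from length and breadth in cm: V = L * B^2 * 0.507."""
    return length_cm * breadth_cm ** 2 * SHAPE_COEFFICIENT


def clutch_summary(eggs: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Return (total clutch volume, mean egg volume) for a list of (L, B) pairs."""
    volumes = [egg_volume(length, breadth) for length, breadth in eggs]
    total = sum(volumes)
    return total, total / len(volumes)


# Hypothetical clutch of three eggs (made-up measurements in cm, not study data)
example_clutch = [(7.5, 5.0), (7.4, 4.9), (7.6, 5.1)]
total_volume, mean_volume = clutch_summary(example_clutch)
print(f"clutch volume = {total_volume:.1f} cm^3, mean egg volume = {mean_volume:.1f} cm^3")
```

Because callipers were read in millimetres, measurements would need to be converted to centimetres before they are passed to such a function.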

We caught adults during the period when they attended their brood using a 3 × 5 m whoosh net. One person herded the family of geese slowly to the catching area, and when they arrived directly in front of the furled net, the other researcher pulled the trigger. Adults were marked with individual metal rings and weighed to the nearest 10 g using a digital spring balance. Head length, culmen length and tarsus length were measured to the nearest 0.1 mm using callipers; wing length (maximum flattened chord) was measured to the nearest 1 mm using a foot rule. A blood sample (approximately 300 μl) was collected from the brachial vein. Blood samples were kept cold (4°C) and centrifuged within 8 h. The separated plasma was stored at −20°C until carotenoid levels were determined (see below).

Chicks were weighed to the nearest 1 g using a spring balance (<300 g) or to the nearest 10 g using a digital spring balance (> 300 g). We measured head length, culmen length, wing length and tarsus length (±0.1 mm) using callipers to determine chick age from a growth curve for head and tarsus established from chicks of known age in 2005. For each clutch, the mean chick age and thereby mean hatching date were calculated. As not all territories were visited before hatching and thus hatching dates could not be determined from egg density for all individual females, we used this estimated hatching date for further analyses. Estimated hatching dates from egg measures and chick measures were highly correlated (r = 0.973, P < 0.001, N = 41).

The body condition of females was determined accounting for structural size and chick age, as we did not catch females during the prelaying period and captured females had goslings of different ages (mean age, 11 ± 1 days). Females lose weight during incubation and regain weight after hatching of the chicks (Summers and McAdam 1993). A principal components analysis extracted one principal component (PC1) with an eigenvalue of 1.794 from measurements of wing, head, bill and tarsus as an overall measure of body size. PC1 explained 44.844% of the variance. We estimated an expected body mass for each individual based on a multiple linear regression of body mass on the first principal component score (PC1) and chick age. Body condition was calculated as the ratio of the observed body mass to the derived expected body mass, according to female size and chick age.
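The body condition index described above (observed mass divided by the mass expected from body size and chick age) can be sketched in a few lines. This is a minimal illustration under stated assumptions: the PC1 scores are taken as already computed, the regression is an ordinary least-squares fit solved through the normal equations, and all function names and numbers are hypothetical rather than the SPSS procedure used in the study.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def body_condition(mass: Sequence[float], pc1: Sequence[float],
                   chick_age: Sequence[float]) -> List[float]:
    """Observed mass / expected mass, with expected mass from mass ~ PC1 + chick age."""
    rows = [[1.0, size, age] for size, age in zip(pc1, chick_age)]  # design matrix
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, mass)) for i in range(3)]
    beta = solve_linear_system(xtx, xty)
    expected = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    return [obs / exp for obs, exp in zip(mass, expected)]


# Hypothetical values (not study data): body mass in g, PC1 scores, chick age in days
masses = [3050.0, 2900.0, 3300.0, 3100.0, 2950.0]
pc1_scores = [-1.0, 0.0, 1.0, 0.5, -0.5]
ages = [5.0, 10.0, 15.0, 8.0, 12.0]
print([round(c, 3) for c in body_condition(masses, pc1_scores, ages)])
```

A ratio above 1 means a female is heavier than expected for her size and the age of her brood, and a ratio below 1 means she is lighter.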

Colour measurements

We measured the colour of the female head and tarsus based on digital photos. We used a Canon EOS 350D digital camera (8.0 megapixels) with a Canon Zoom Lens (EF-S 18–55 mm f/3.5–5.6) with automatic settings for integration time (shutter speed) and lens aperture, and with the white balance set to ‘daylight’. All photos were taken outdoors with the integrated flash lighting. We held the right tarsus next to a yellow reference card close to the ground and took two pictures from above (Fig. S2). Similarly, a photo of the head together with an orange reference card was taken from the left side (Fig. S3). We avoided direct sunlight on tarsus, head and reference card. Our method of colour measurement only includes the human-visible spectrum (approximately 400–700 nm) and excludes variation in the ultraviolet range, which is visible to the birds (e.g. Cuthill et al. 2000). However, the analysis of digital photos has proved to be a useful tool to study colouration in animals in general (Bergman and Beehner 2008) and also for the study of carotenoid based colouration in birds (e.g. Fitze et al. 2003; Perez-Rodriguez 2008; Mougeot et al. 2009).

Each digital photograph was imported into Adobe Photoshop CS3, and colour was measured as follows. First, we randomly selected five non-shaded areas of 100 × 100 pixels along the tarsus between the tibio-tarsal articulation and the foot and five non-shaded areas on the head in a circle around the eye and recorded the red, green and blue (RGB) levels using the histogram palette (averaged over the selected pixels) following Fitze and Richner (2002). Similarly, we measured five areas of the same size on the reference cards, and we calculated a mean value for tarsus, head and reference card. Digital images consist of a matrix of microscopic photocells where colour is recorded as brightness values (in the range, 0–255) of RGB. If any of the RGB values for any square was 255 (the upper limit of the camera), the photo was discarded (see Bergman and Beehner 2008). Because the actual value in each channel is only informative relative to the values in the other channels, we analysed the ratio of red to green (R/G) (Bergman and Beehner 2008). To account for differences in natural lighting conditions, we used the difference between the tarsus R/G value and the R/G value of the yellow reference card and the difference between the head R/G value and the R/G value of the orange reference card in each picture for further analysis and refer to this as ‘R/G’ for simplicity. R/G values of the two pictures taken in a row were highly correlated (r = 0.88, n = 64, P < 0.001), and we used the mean value of the two pictures for further analysis.
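The colour score described above can be illustrated with a small calculation. In this Python sketch the function names and RGB values are hypothetical, and one detail is an assumption: the text does not state whether R/G was computed from the mean RGB of the five areas or averaged over per-area ratios, so the sketch uses the ratio of the mean channel values.

```python
from typing import List, Optional, Tuple

RGB = Tuple[float, float, float]
SATURATION = 255  # upper limit of the channel range given in the text


def mean_rgb(patches: List[RGB]) -> RGB:
    """Average the R, G and B levels over the measured areas."""
    n = len(patches)
    return (sum(p[0] for p in patches) / n,
            sum(p[1] for p in patches) / n,
            sum(p[2] for p in patches) / n)


def relative_rg(trait_patches: List[RGB], card_patches: List[RGB]) -> Optional[float]:
    """Trait R/G minus reference-card R/G; None if any channel value is saturated."""
    if any(ch >= SATURATION for p in trait_patches + card_patches for ch in p):
        return None  # photo would be discarded
    trait_r, trait_g, _ = mean_rgb(trait_patches)
    card_r, card_g, _ = mean_rgb(card_patches)
    return trait_r / trait_g - card_r / card_g


def mean_of_photos(scores: List[Optional[float]]) -> Optional[float]:
    """Mean score over the usable photos of one bird (None if none is usable)."""
    usable = [s for s in scores if s is not None]
    return sum(usable) / len(usable) if usable else None


# Hypothetical patch values for one tarsus photo (not study data)
tarsus = [(200.0, 100.0, 50.0)] * 5
yellow_card = [(180.0, 120.0, 40.0)] * 5
print(relative_rg(tarsus, yellow_card))
```

Subtracting the card value in the same photo is what makes scores comparable across photos taken under different light.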

To better describe what R/G difference values signify in terms of colour, we assessed the colour of our reference card using the GretagMacbeth ColorChecker colour rendition chart [product no. 50105, manufactured by Munsell Colour, division of GretagMacbeth; for a detailed description and picture of the chart see Bergman and Beehner (2008)]. We took 20 photos of the ColorChecker together with the yellow and orange reference card, respectively, and analysed them in Photoshop, as described above. We calculated the differences between the R/G values of the yellow card and the R/G values of the orange, orange-yellow and yellow square on the ColorChecker chart (R/G difference = value ColorChecker − value reference card). The difference increased from yellow (−0.335 ± 0.010), over orange-yellow (0.008 ± 0.015) to orange (0.860 ± 0.033); that is, higher tarsus R/G values indicate a more orange-like tarsus in our study. The same was done for the orange reference card, where we calculated the differences to the red, orange and orange-yellow square. Here, the difference increased from red (−2.750 ± 0.232) over orange (0.965 ± 0.091) to orange-yellow (1.683 ± 0.091); that is, lower head R/G values indicate a more red-like head colouration in our study.

Carotenoid analysis

We quantified plasma carotenoid levels following Alonso-Alvarez et al. (2004). We diluted 10 μl of plasma in 90 μl of absolute ethanol, vortexed the mixture and centrifuged it at 1500×g to precipitate flocculent proteins. The supernatant was examined in an Ultrospec 2000 (Pharmacia Biotech) spectrophotometer, and the optical density at 450 nm (maximal absorbance of lutein) was determined. We calculated plasma carotenoid concentration (microgram per millilitre) using a standard curve of lutein (Sigma Chemicals). Each plasma sample was double tested, and the obtained values were highly correlated (r = 0.97, N = 83, P < 0.001). Alonso-Alvarez et al. (2004) assessed the reliability of this technique by comparing a high performance liquid chromatography (HPLC) analysis of plasma samples with colorimetric measurements. HPLC determined the presence of four carotenoids in the plasma (lutein, zeaxanthin, anhydrolutein and β-cryptoxanthin). The total amounts of carotenoids determined by the two techniques were highly correlated; colorimetric measurements could thus be considered as representative of total plasma carotenoid concentration.
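Reading a concentration off a standard curve can be sketched as follows. The example assumes a straight-line relationship between optical density at 450 nm and lutein concentration, and it assumes that the 1:10 dilution (10 μl plasma in 90 μl ethanol) is corrected by multiplying by 10; neither detail is spelled out in the text, and all names and numbers are hypothetical.

```python
from typing import Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Least-squares slope and intercept of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def plasma_concentration(absorbance: float, slope: float, intercept: float,
                         dilution_factor: float = 10.0) -> float:
    """Invert the standard curve and correct for dilution (assumed factor of 10)."""
    diluted = (absorbance - intercept) / slope
    return diluted * dilution_factor


# Hypothetical lutein standards (μg/ml) and their optical densities, not study data
standard_conc = [0.0, 1.0, 2.0, 4.0]
standard_od = [0.0, 0.1, 0.2, 0.4]
slope, intercept = fit_line(standard_conc, standard_od)
print(plasma_concentration(0.25, slope, intercept))
```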

Statistical analysis

Statistical tests were performed with SPSS 11.0. Normality was tested with Kolmogorov–Smirnov tests. Means are given with standard errors. We used Pearson correlations and stepwise linear regressions if the assumptions of normality and equal variances were met. Significance level was set at α = 0.05.

We captured and measured 29 females in 2007 and 38 in 2008, of which 20 females were measured in both seasons. To test which parameters (colouration, body condition, PC1 as a measure of body size and hatching date) have the biggest influence on clutch and egg size, we performed stepwise linear regressions, with probability of 0.05 for entry and 0.10 for removal. We only included the first measurement of each female in these analyses to avoid pseudoreplication. To test whether tarsus and head colouration could be predicted from body condition, body size and hatching date, we also performed stepwise linear regressions. As captured adult birds were of unknown age, individual age could not be included in the analyses. Sample sizes differed because of missing values. Some territories were not visited before chicks hatched, and hence, data on clutch size and egg volumes are missing. Furthermore, not all females were blood-sampled, and some photos were discarded (see above).

Results

The tarsus and head R/G values were similar for the two study years (Table 1). Tarsus and head R/G values were not correlated (Pearson correlations, r = 0.063, N = 40, P = 0.700).

Table 1 Mean values of plasma carotenoid concentrations and tarsus and head colouration of adult female upland geese in the wild: we tested for differences between the years using analysis of variance (see “Statistical analysis” section)

Tarsus colouration, reproductive parameters and body condition

Stepwise multiple regression suggested that clutch volume increased significantly with tarsus R/G values, with females with more orange-like tarsus colouration having laid clutches with a bigger volume, and was higher in individuals with earlier hatching dates (Table 2 and Fig. 1a). In addition, mean egg volume was higher in individuals with higher tarsus R/G values (Table 2 and Fig. 1a). Stepwise multiple regression found none of the possible explanatory variables body condition, body size and hatching date to be related to variation in tarsus R/G values.

Table 2 Results of stepwise multiple linear regression models for clutch and mean egg volume, including tarsus and head colouration of adult female upland geese as predictor variables (n.i. means that the variable was not included by the model as a significant predictor)

Fig. 1 Relationship between (a) tarsus colour and (b) head colour (R/G values) and clutch volume (black dots, solid line) and mean egg volume (white dots, dashed line) in female upland geese on New Island. Please note the inverse scale for head R/G values. See text for statistical analysis

Head colouration, reproductive parameters and body condition

Stepwise multiple regression revealed that clutch volume was not related to head R/G values and body size but could be best predicted from a combination of hatching date and body condition (Table 2). Mean egg volumes were higher in females with a more red-like head colouration (Table 2 and Fig. 1b). Head R/G values were lower, i.e. more red-like, in females with a higher body condition (F(1,40) = 7.442, t = −2.728, P = 0.010, Fig. 2) but unrelated to body size and hatching date.

Fig. 2 Relationship between head colour (R/G values) and body condition in female upland geese on New Island. Please note the inverse scale for head R/G values. See text for statistical analysis

Tarsus and head colouration and plasma carotenoid concentrations

The concentration of carotenoids circulating in the plasma ranged from 3.1 to 31.5 μg/ml and was similar for the two study years (Table 1). Plasma carotenoid concentrations were independent of capture date (r = −0.03, P = 0.78, N = 79).

Plasma carotenoid concentrations were higher in females with higher R/G values, i.e. a more orange-like tarsus colouration (Fig. 3, Pearson correlation: r = 0.336, P = 0.015, N = 52) but unrelated to head colouration (r = −0.091, P = 0.587, N = 38). Plasma carotenoid concentrations were also higher in females with higher mean egg volume but unrelated to clutch volume (Pearson correlations; mean egg volume, r = 0.50, P = 0.008, N = 27; clutch volume, r = 0.31, P = 0.120, N = 26; Fig. 4). The results remained qualitatively the same after controlling for hatching date (partial correlations; mean egg volume: r_p = 0.46, df = 23, P = 0.020; clutch volume: r_p = 0.22, df = 23, P = 0.290).
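A first-order partial correlation, such as the one used here to control for hatching date, can be computed from the three pairwise Pearson correlations. The sketch below uses the standard textbook formula; the function names and the example data are hypothetical and do not reproduce the values reported above.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(x: Sequence[float], y: Sequence[float],
                        z: Sequence[float]) -> float:
    """Correlation between x and y after removing the linear effect of z."""
    r_xy = pearson(x, y)
    r_xz = pearson(x, z)
    r_yz = pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical data (not study data): carotenoids, mean egg volume, hatching date
carotenoids = [8.0, 12.0, 15.0, 20.0, 25.0, 11.0]
egg_volume = [88.0, 92.0, 91.0, 97.0, 99.0, 90.0]
hatch_date = [20.0, 14.0, 18.0, 10.0, 12.0, 16.0]
print(round(partial_correlation(carotenoids, egg_volume, hatch_date), 3))
```

With one control variable, the degrees of freedom for the test of a partial correlation are N − 3.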

Fig. 3 Relationship between plasma carotenoid concentrations and tarsus colour (R/G values) in female upland geese. See text for statistical analysis

Fig. 4 Relationship between plasma carotenoid concentrations and clutch volume (black dots) and mean egg volume (white dots, dashed line) in female upland geese. See text for statistical analysis

Individual comparisons across years

Plasma carotenoid concentrations were significantly correlated between the two study years (Fig. 5a, r = 0.56, N = 20, P = 0.01). The intra-individual differences ranged from 0 to 12 μg/ml with a mean of 5.40 ± 0.7 μg/ml. In addition, the tarsus colour (R/G values) of individuals measured in 2007 and in 2008 was significantly correlated among years (Fig. 5b, r = 0.65, N = 15, P = 0.009), but head R/G values were not (r = 0.61, N = 8, P = 0.112). The individual changes in tarsus R/G values (value 2008 − value 2007) were significantly positively related to differences in carotenoid concentrations between the two study years (Fig. 6, Pearson correlation: r = 0.52, N = 15, P = 0.048).

Fig. 5 Relationship between (a) plasma carotenoid concentrations and (b) tarsus colour (R/G values) of the same individual female upland geese measured in subsequent years

Fig. 6 Relationship between the change in plasma carotenoid concentrations and the change in tarsus colour (R/G values) between two consecutive years (2007–2008) in female upland geese

Discussion

We found that female-specific colouration in upland geese can reliably indicate different aspects of female phenotypic quality. Females with more orange-coloured legs and more red-like head colours had higher clutch and egg volumes than females with a paler leg and head colouration. Furthermore, individuals with a more reddish plumage colouration were in better body condition. These relationships provide the theoretical possibility for males to assess female phenotypic quality on the basis of colouration. A male mated with a female with more orange-coloured legs and more reddish head plumage can expect a higher reproductive output. To determine whether male mate choice is influenced by colouration, however, experiments have to be carried out. We also found that female tarsus colour was repeatable between years, indicating that tarsus colour is a reliable signal of long-term female quality in this species. At the same time, tarsus colouration was also a dynamic trait because the variability in tarsus colour reflected differences in plasma carotenoid levels both among and within individuals.

Egg size has been discussed in the context of individual quality before, with phenotypic and genetic factors predisposing some individuals to perform better under a given set of conditions and lay larger eggs (Reid and Boersma 1990; Croxall et al. 1992; Blackmer et al. 2005; Ardia et al. 2006). Especially in precocial species like waterfowl, female egg investment may play an important role in determining survival prospects of offspring (Amat et al. 2001) and thereby increasing individual fitness. Larger eggs may enhance offspring fitness by (a) increasing survival prospects in the first days after hatching and (b) increasing competitive ability through larger offspring size and the possession of more resources to survive adverse conditions (Ankney 1980; Amundsen and Stokland 1990; Magrath 1992; Dawson and Clark 2000; Anderson and Alisauskas 2002; Christians 2002; Goth and Evans 2004; Rutkowska and Cichon 2005). In upland geese, body condition of young of age ≤20 days increased with increasing egg size (Gladbach et al. 2010). The positive relationship between female colouration and egg and clutch size in our study system may therefore indicate individual reproductive quality. Furthermore, we found that female tarsus colouration is positively related to plasma carotenoid levels, suggesting that it is an honest indicator of the health state or of immunocompetence in female upland geese. This is in line with other studies showing that integument colouration reflected circulating carotenoid levels (Juola et al. 2008; Perez-Rodriguez 2008; Casagrande et al. 2009; Mougeot et al. 2009).

Only few studies have shown that a strongly environmentally determined signal like carotenoid-based integument colouration is consistent within individuals over time (Dawson and Bortolotti 2006; Perez-Rodriguez 2008). Such consistency is important in relation to male mate choice, especially in species that form life-long pair bonds such as the upland goose. If a sexual trait changes over time, the trait might be honest, but it would only reflect the current state and would have little predictive value for future reproductive success (Greenfield and Rodriguez 2004; Senar and Quesada 2006; Perez-Rodriguez 2008). Dawson and Bortolotti (2006) found in female American kestrels Falco sparverius that integument colour scores in the pre-laying period and during incubation were correlated. Perez-Rodriguez (2008) reported that in red-legged partridge A. rufa, bill and eye-ring colouration were highly repeatable within individuals, both within the breeding season and between two consecutive years. We show here that tarsus colouration of the same individuals in two consecutive years is strongly correlated. Measures of individual quality should be relatively consistent over longer time spans to be a reliable indicator for the inherent quality of individual birds (Ochs and Dawson 2008); female tarsus colouration thus could be used as such a quality signal. Monitoring over several years together with a complete data set for measures of phenotypic quality could shed further light on the usefulness of tarsus colouration as a measure of individual quality in female upland geese. We also found a strong correlation between years in individual plasma carotenoid concentrations. This contrasts with the study of Perez-Rodriguez (2008), where no such correlation was found. However, in that study, birds were kept in an aviary and fed ad libitum. 
Variability between individuals even when carotenoid access and health status are standardised indicates that intrinsic physiological or genetic factors may affect the ability to absorb, transport and transform carotenoids (McGraw and Hill 2001; Hadfield and Owens 2006). In our study, within-individual variation (maximum difference, 12 μg/ml) was smaller than between individual variation (maximum difference, 29 μg/ml). Despite relatively small intra-individual variation, the changes in plasma carotenoid levels were still reflected in individual variation in tarsus colouration between the 2 years. Our data thus support results from aviary experiments (Perez-Rodriguez 2008) and show that integument colouration is a dynamic trait remaining consistent between individuals also in a wild bird under natural conditions.

Territory quality and carotenoid availability in territories may be an important factor in determining the between-year correlation in plasma carotenoid levels. This is because upland geese are highly territorial and return to the same territories year after year. Measurements of tarsus colour and plasma carotenoid levels during the non-breeding season, when upland geese gather in large groups around ponds and thus share the same environmental conditions, could shed further light on the importance of the breeding territory for the expression of the ornament. Furthermore, we cannot exclude the hypothesis that females mated to high-quality males that defend high-quality territories have more orange-like tarsi because they would have access to carotenoid-rich food and are able to invest more in reproduction. This could be studied experimentally, by removing pairs and measuring changes in tarsus colour in relation to changes in territory quality. This would also require that at least territory quality and perhaps also male competitive ability in territory defence can be measured. Preliminary results showed no relationship between male body condition and female tarsus colouration and no repeatability of male plasma carotenoid concentrations between two consecutive years (Gladbach et al. 2010), making this explanation perhaps less likely.

The reddish head colouration in upland geese may be based on melanins. Melanins are one of the most important groups of pigments involved in bird colouration (McGraw 2006) and are responsible for the reddish-brown, brown, black or grey colour of feathers. The effects of environmental and individual quality on melanin-dependent traits are still debated (Griffith et al. 2006; Ducrest et al. 2008; McGraw 2008). Several studies have found melanin-based traits to be related to individual quality (reviewed in Jawor and Breitwisch 2003). Melanogenesis can be influenced by physiological condition, with food availability and changing hormonal conditions affecting the biosynthesis and deposition of melanin into ornaments (Jawor and Breitwisch 2003), which could explain the more reddish head colouration of females with higher body condition. Alternatively, this may also be linked to a condition dependency of the moult of body feathers, with females in lower body condition skipping moult. The skipping of moult of primaries, sometimes in three consecutive years, has been reported in upland geese before (Summers 1983b), and the gradual impairment of flight feathers might be unimportant to this non-migratory species. This moult skipping could have the advantage of saving energy and nutrients, which possibly also applies to body feathers. The colour of feathers might then be subject to degradation, leading to a paler colour of older feathers. This is supported by one case where we caught a female moulting head feathers (not included in the analyses) and the new feathers were of a darker reddish-brown colour than the old ones (Fig. S4). Another possibility is the occurrence of an age effect, with older females producing darker feathers. Both increases (e.g. Dale et al. 2002; Galvan and Moller 2009; Vergara et al. 2009) and decreases (e.g. Potti and Montalvo 1991; Siefferman et al. 2005) in the intensity of melanin-based colour with age have been reported before.
To test this in upland geese, a comparison of individual feather colouration in several years or between individuals with known age is necessary. The non-significant correlation of head colour between the 2 years could be caused by the small sample size in our study (only pictures of eight females could be compared) because the high correlation coefficient (0.61) indicates that there is a trend also for head colouration to be stable between years. Plumage characters have been found to remain reasonably stable between moults in other studies (e.g. Senar and Quesada 2006). However, to confirm this in upland geese, the comparison of a larger number of individuals is necessary.

In summary, our study provides evidence that the yellow-orange tarsus and reddish-brown head in female upland geese, a colouration that only occurs in females, can indicate individual quality. Only a small number of previous studies report positive relationships between female-specific ornamentation and aspects of phenotypic quality (Domb and Pagel 2001; Weiss 2006; Weiss et al. 2009). Jawor and Breitwisch (2003) hypothesise that birds displaying both carotenoid and melanin ornaments provide a more reliable indication of their overall quality than birds displaying only one or the other type of ornament. We furthermore report for the first time in a wild bird under natural conditions that a female-specific ornament is stable across years. Following individuals for a longer period of time and including birds of known age (ringed as chicks) would further add to our knowledge about the development and reliability of quality signals in birds establishing long-term pair bonds.