Intraspecific variation in performance on cognitive tasks is ubiquitous. Individuals within species and populations vary in how quickly they learn, the type of information they learn, how much they learn, how long they remember learned information, how much information they remember, and more (Boogert et al., 2018). This variation is key in understanding cognitive evolution because natural selection acts upon individual differences, and so investigating the evolution of cognitive abilities requires identifying cognitive traits (e.g., potentially learning speed; see below) that vary among individuals, are repeatable, heritable, and affect fitness (Thornton et al., 2014). Identifying cognitive traits that are available for selection further allows for examining the extent to which different cognitive traits are related or not, and the existence of potential tradeoffs between cognitive traits. However, relatively little is known about what cognitive traits are available to selection and how different traits relate to each other.

Investigating individual variation in cognition is key to identifying specific cognitive traits and cognitive domains—modules that process specific types of information for specific functions (Barrett & Kurzban, 2006; Sherry & Schacter, 1987; Shettleworth, 2012)—the extent to which domains and traits are related or not, and in turn, how selection can act and has acted upon these cognitive domains and traits (Boogert et al., 2018; Healy et al., 2009; Sonnenberg et al., 2019; Völter et al., 2018). Cognitive domains may involve both specific sensory modalities through which information is gathered (e.g., visual, olfactory) and specific behavioural systems related to the ecological relevance of these stimuli (e.g., reproduction, foraging; Hogan, 1988). Individual variation in cognitive performance measures such as learning speed, learning accuracy, or length of memory may or may not represent valid constructs (i.e., cognitive traits) that are repeatable and reflect underlying neural processes (Healy et al., 2009; Healy & Rowe, 2014). Identifying cognitive traits is essential as there is substantial noise in performance on many cognitive tasks and if a specific performance measure is not repeatable, it begs the question of whether it is a useful construct and if it is effectively measuring a true cognitive trait. For example, does an individual who learns quickly on one task do so on a different task? Examining individual variation can reveal repeatable cognitive traits and clarify what cognitive traits are available to selection pressures, as well as potentially identify why individual variation is maintained in a population.

Human research is especially illuminating regarding cognitive domains and individual variation, as the most intensive investigations into individual differences and cognitive domains have been conducted in humans (Carroll & Maxwell, 1979; Wasserman, 2012). Decades of research into human cognition have shown that around 50% of the variance in performance across a wide range of cognitive tasks is explained by a “general intelligence factor,” or “g,” indicating that common underlying mechanisms are used in a variety of seemingly distinct cognitive tasks (Deary, 2001; Flaim & Blaisdell, 2020). Examining individual variation across cognitive tasks also typically reveals separate group factors, or cognitive domains, in addition to g; these specific factors vary slightly among studies and all correlate with g, but typical group factors include verbal comprehension, perceptual organization, spatial memory, working memory, and processing speed (Deary, 2001). Research into human individual differences in cognition thus offers one starting point and comparison in efforts to understand the structure and domains of cognition in other species.

Just as in humans, investigating individual differences in animals can reveal the domains of cognition that selection can act upon and help determine why performance across different tasks might correlate or not. One of the central questions of comparative cognition is the extent to which cognitive mechanisms are domain-specific and adaptive to very specific situations, versus domain-general and adaptive to a variety of contexts or easily exapted to other contexts (Boogert et al., 2018; Burkart & van Schaik, 2016; Chiappe & MacDonald, 2005; Huber, 2017; Kanazawa, 2004; Macphail & Bolhuis, 2001). Examining individual variation makes it possible to determine the extent to which performance in one domain corresponds with performance in another. While human research supports the presence of g, much less empirical work has been undertaken in animals.

Studies in non-human species that examine performance across a battery of cognitive tests have found a primary factor accounting for 28% of performance variation in mice (Sauce et al., 2018), 34% in toutouwai (Petroica longipes; Shaw et al., 2015), 64% in Australian magpies (Ashton et al., 2018) and values within a similar range in other studies and species (for reviews, see Burkart & van Schaik, 2016; Flaim & Blaisdell, 2020). However, the only meta-analysis on this topic found that correlation between performance of two different cognitive tasks was low (r = .19) but significant, and only a few species have been examined (Huber, 2017; Poirier et al., 2020). Additionally, strong factor loadings and correlations in performance across individuals can be due to other noncognitive variables such as motivation and do not necessarily indicate the existence of g (Shuker et al., 2017), especially when repeatability on a given task is low as often seems the case in cognitive research (Cauchoix et al., 2018; Dingemanse & Dochtermann, 2013; Poirier et al., 2020).

As it is, the presence of g outside of humans is uncertain, controversial, and often counter to the driving hypotheses of much of comparative cognition and evolutionary psychology research (Chiappe & MacDonald, 2005; Kanazawa, 2004). Some prominent hypotheses of cognitive evolution often (though of course not always) operate under the assumption that specific evolutionary forces favor greater domain-general intelligence or “enhanced” cognition across a variety of domains rather than domain-specific adaptations. Examples include the hypotheses that living in larger groups or having more social interactions (Ashton et al., 2020; Dunbar, 1998, 2009), living in harsher or more variable environments (Hermer et al., 2018; Sol, 2009; Sol et al., 2005; but see Roth et al., 2010), or having a more variable and patchy foraging ecology (Henke-von der Malsburg et al., 2020; Rosati, 2017) leads to improved cognitive abilities broadly. Much comparative research, however, takes an adaptive approach to cognition, hypothesizing that specific cognitive abilities are adapted for specific ecological needs, and that this cognitive adaptation results in clear modularity in cognition and neural structure (Shettleworth, 2012). The evidence is clear that many cognitive abilities are domain-specific and modular to at least some extent, including spatial learning, vocal learning, and food-aversion learning (Shettleworth, 2012). Domjan’s research has been crucial in recognizing that learning and cognition are not simply domain-general mechanisms that operate the same across different stimuli but rather are often adapted to specific contexts and behaviours pertaining to a species’ ecology (Domjan, 1983). Ultimately, examining individual differences can provide key insights into the extent of domain-general versus domain-specific cognitive abilities.

Zebra finches have long been used in cognitive research, especially in the investigation of the cognitive mechanisms and underlying neurobiology of song learning. This research has identified many adaptations for song learning in this species, revealing clear domain-specific abilities (Brainard & Doupe, 2002; Zann, 1996). More recently, our laboratory and others have examined physical cognitive abilities in zebra finches (Bailey et al., 2014; Lambert et al., 2021; Muth & Healy, 2014). Physical cognition, defined as how animals process information regarding the physical forces and structural properties of the environment (Auersperg et al., 2017), involves using visual and sensorimotor stimuli to make decisions and may or may not rely upon domain-specific cognitive mechanisms (Lambert et al., 2021). In the wild, zebra finches build domed nests out of grasses and twigs that are typical of estrildine finches (Zann, 1996); the male is the primary builder, and zebra finches readily build in captivity using a variety of materials (Bailey et al., 2014). Zebra finches clearly learn from their building experiences (Breen et al., 2020; Camacho-Alpízar et al., 2021a; Muth & Healy, 2011; Sargent, 1965), indicating that they acquire information about nest building (including the structural properties of the materials they use) that influences their future behaviours.

It is unknown to what extent zebra finch performance in acoustic discrimination tasks and physical discrimination tasks represent adaptive domain-specific cognitive abilities or rather involve domain-general abilities or cognitive abilities that may have evolved for other purposes (Shettleworth, 2012; Taylor & Gray, 2014; Teschke et al., 2013). In the current study, we examined individual learning and learning generalization of zebra finches across eight different tasks from two different experimental paradigms: an in-cage foraging board and an acoustic operant box. Our foraging board tasks involved visual and sensorimotor information, with string length and flexibility both structural properties relevant to zebra finch nest building (Bailey et al., 2014; Lambert et al., 2021; Muth & Healy, 2014). The tasks in the acoustic operant box involved visual and auditory stimuli, with the auditory stimuli in the auditory discrimination being social information (i.e., zebra finch call playbacks). As such, our two experiments included both different types of sensory information and different ecological relevance or relation to different behavioural systems (Domjan, 1983; Hogan, 1988). We sought to analyze individual differences across these different tasks in order to determine (1) the extent to which there is evidence of general learning ability across the different tasks, (2) the extent to which there is evidence of domain-specific learning abilities, and (3) the extent to which learning speed is related to transfer of learning (generalization).



We used 34 zebra finches (18 females, 16 males, all 300+ days old at the start of the experiment) that were bred and raised at the University of Alberta. Every bird was bred in our laboratory in King Cages (50 × 100 × 50 cm; King Cages International LLC) and subsequently separated from their parents at nutritional independence (~35 days posthatch) into another King Cage that held only juvenile birds. Once the sexes of the juvenile birds were visually distinguishable (~35–45 days posthatch), they were moved into colony cages that held other same-sex birds (165 × 66 × 184 cm). Each bird then went through another experiment after they had reached sexual maturity (~90 days posthatch) wherein nonsibling male–female pairs first built a partial nest using coloured twine (Baker’s Twine, James Lever Co., London, UK; Camacho-Alpízar et al., 2021b), and then built a full nest using coconut fiber (Aves Canada); 26/34 birds successfully bred in these coconut fiber nests, while the other eight birds had failed nests or disrupted nests due to the COVID-19 pandemic research shut-down. Throughout all of the aforementioned housing, birds were provided ad libitum access to mixed seed (Hagen Canada, Quebec, Canada), gravel (Hartz, Ontario, Canada), oyster shell (Canadian Lab Diets, Inc.), cuttlefish bone (Canadian Lab Diets, Inc., Alberta, Canada), and water, on a 14:10 light:dark cycle (full spectrum lights—Standard, 32W, T8 Daylight). Birds were supplemented with spinach and Prime Vitamin Supplement (Hagen) three times a week, spray millet once per week, and daily egg mix (CeDe-Finches) during breeding. All experimental procedures and husbandry were approved by University of Alberta Animal Care and Use Committee (AUP 00002923/0001937).

Each bird was tested in the foraging board experiment (Lambert et al., 2021) for 13–40 days (mean ± standard deviation [SD]: 22.82 ± 6.19) depending on how long it took each bird to learn. Each bird was then returned to the colony rooms until starting the acoustic operant experiment, and the interval between finishing the first experiment and starting the second experiment ranged from 13 to 216 days (mean ± SD: 109.41 ± 60.95); this interval was not related to performance on the second experiment (see supplementary information). For the second experiment (Sahu et al., 2022) birds were placed in an acoustic operant chamber and remained there until passing or failing the experiment (range: 7–155 days; mean ± SD: 47.36 ± 32.46). Each bird was tested on the tasks in the foraging board experiment and the operant chamber experiment in the same order (Table 1)—this ensures that birds’ experiences are comparable and that among-individual variation is not influenced by differential order of tasks.

Table 1 List of the different tasks and what is being learned

Foraging board experiment

Each bird was first tested in the foraging board tasks. The foraging board tasks were a part of another experiment examining potential sex differences between male and female zebra finches in their ability to discriminate between physical properties of materials; for further details on these methods, see Lambert et al. (2021); the methods here are presented in brief.


Birds were housed in same-sex pairs in King Cages (Fig. 1) that had two mini BNC cameras (OSY CAMS) attached inside the cage, and each camera recorded its respective half of the cage. Each bird was provided food, supplements, and water, as mentioned above, but all food and supplements were removed 90 minutes before training trials and remained unavailable during them. The foraging board (21.4 × 14.5 cm) contained 24 wells (1.3 cm diameter × 1.3 cm deep) arranged in a 6 × 4 grid (Fig. 1), and opaque white plastic chips (1.9 cm diameter) were used to cover the wells. Each chip had a piece of string (all string from James Leaver CO., Bristol, UK) attached with glue. The string colour and length depended on the cognitive task: we used 2.5-cm-long string of each of the five colours/types in shape training, with the string coiled up and glued on top of the chips using nontoxic wood glue (Henkel Canada Corporation, Ontario, Canada); 1.5-cm and 4-cm green string was used for length discrimination, 2.5-cm white flexible (unpolished cotton) or white stiff (polished cotton) string for flexibility discrimination, and 2.5-cm yellow and blue string for colour discrimination. During the first four steps of shape training, the chips did not have rubber stoppers attached to them. For the final step of shaping and all discrimination training trials, each chip was fitted on the bottom with either a rubber stopper (5 mm deep) that fit securely into a well, or white craft putty (iLoveToCreate, California, USA) that stuck the chip to the foraging board while covering a well.

Fig. 1

Top-down view of the experimental cage layout during trials, with examples of the foraging board for the four different tasks. A divider (labeled) was placed in the cage to separate the birds during the trials, and the foraging board was provided to one bird at a time. Each bird encountered the four tasks in the order displayed: (1) shaping, (2) length discrimination, (3) flexibility discrimination, and (4) colour discrimination. The chips with string for each phase were randomly placed on the board in the figure to demonstrate what a trial may look like. (See supplementary files for a video of birds completing the different trial types.)

Cognitive tasks

From August 2020 through March 2021, we tested birds in the experiment. We first trained birds with five steps of shape training following Boogert et al. (2008), after which birds completed three discrimination tasks following procedures similar to those of previous studies (Brust et al., 2013; Brust et al., 2014; Guillette et al., 2015; Jha & Kumar, 2017; Kriengwatana et al., 2015); see Table 1 for a summary of each task. On each training day at 0900 (2 h after lights on), we removed all food from each cage, removed and replaced the cage bottom tray (to remove scattered food and supplements), and placed an opaque plastic divider in the middle of each cage to separate paired individuals. The first trials began 90 minutes after food removal and separation, and each bird experienced six trials per day from 1030 to 1400, every day of the week. For any given trial, each bird was provided a foraging board for up to 5 minutes, followed by a 30-minute intertrial interval. We selected the wells to bait on the foraging board using a random number generator.

Shape training

Each bird was shape-trained to access food (three millet seeds) from the foraging board wells by removing a plastic chip from a well over six steps: a habituation phase followed by five steps of shaping. For the habituation phase, a foraging board covered with mixed seed was placed in the bird’s assigned half of the cage 24 h prior to the start of shaping trials. For Step 1, the board had three millet seeds in each of five random wells; Step 2 was similar, but one chip was next to each well that contained food, each chip having one of the five string types used in the experiment; for Step 3, each of the five chips now half-covered its well; for Step 4, each chip fully covered its well; for Step 5, each chip had bumpers and fully covered its well, meaning that birds had to poke or grasp the chips in order to remove them from the wells and access the food. For each of these shaping steps, birds had to eat from 4/5 wells in three consecutive trials to move on to the next step; if a bird accessed no food for six trials on Step 5, it was placed on a remedial step with the bumper chips placed at an angle in the wells. Once a bird passed Step 5, it proceeded directly to length discrimination training, then flexibility discrimination training, and finally colour discrimination training; each bird participated in the discrimination tasks in the same order.

Discrimination tasks

For each discrimination task, eight chips with string attached were placed randomly over eight wells: four of these wells contained three millet seeds each (S+), while four were inaccessible (S−) because the chips were stuck over them with sticky tack and contained no seeds. Since uncovering S− wells was not possible, a bird pecking or pulling on the string or chip that covered a given well counted as “choosing” that well. A bird was considered to “pass” a trial if it chose the S+ in four of its first five choices in that trial, and a bird reached criterion on a given discrimination task by passing five of six consecutive trials. For the length discrimination, a 4-cm string was the S+, while a 1.5-cm string was the S−. For the flexibility discrimination, rigid string was the S+, while flexible string was the S−. For the colour discrimination, blue string was the S+, while yellow string was the S−. The behavioural measure for each foraging board task was the number of training trials to pass the task (learning speed).
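The pass rule and criterion described above can be sketched in a few lines of Python. This is an illustrative reading rather than the scoring code used in the study: the function names are hypothetical, "four of its first five choices" is assumed to mean at least four, and criterion is assumed to require a full window of six consecutive trials, with learning speed taken as the index of the last trial in the first qualifying window.

```python
def trial_passed(choices, n_first=5, min_correct=4):
    """Return True if at least `min_correct` of the first `n_first` choices are S+.

    `choices` lists the wells chosen within one trial, in order, coded "S+" or "S-".
    Assumption: "four of the first five" is read as "at least four".
    """
    first_choices = choices[:n_first]
    return sum(1 for c in first_choices if c == "S+") >= min_correct


def trials_to_criterion(trial_results, window=6, min_passes=5):
    """Return the 1-based trial number at which criterion is first met, else None.

    `trial_results` is a list of booleans (True = passed trial) in testing order.
    Assumption: criterion needs a complete window of `window` consecutive trials.
    """
    for end in range(window, len(trial_results) + 1):
        if sum(trial_results[end - window:end]) >= min_passes:
            return end
    return None


# Example: a bird that fails trials 1 and 5 meets criterion on trial 7,
# because trials 2-7 contain five passes.
results = [False, True, True, True, False, True, True, True]
learning_speed = trials_to_criterion(results)
```

Under this reading, a bird that never reaches criterion returns `None`, which would then need to be replaced by whatever ceiling score the analysis assigns.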

Acoustic operant experiment

Thirty-two birds (16 male; 16 female) that completed the foraging board experiment were then tested in the acoustic operant experiment. The data collected in the acoustic operant experiment were part of another experiment that asked questions about animal welfare, specifically whether a longer feeder window (2 s versus 1 s) in a free-operant paradigm affects discrimination task performance (Sahu et al., 2022; all procedures approved under AUP0001937).


Each bird was housed, for the duration of testing, in a cage (30 × 40 × 40 cm) that was inside a ventilated and sound attenuated operant chamber (Fig. 2). Each chamber contained a full spectrum LED bulb (3W, 250 lm E26, Not-Dim, 5000 K; Lohas LED, Chicago, IL, USA). Each bird had ad libitum access to grit, cuttlebone, and water. Food was available as a reward for correct responses. A motorized feeder with infrared sensors (Njegovan et al., 1994) was present next to a cage opening (11 × 16 cm), which allowed the birds to access the feeder. Infrared sensors were also located on a request perch located near the entrance to the feeder. A personal computer (Desktop PC with Intel Core i5 Processor and 8 GB of RAM) connected to a single-board computer (Palya & Walter, 2001) scheduled trials and recorded responses to stimuli. Stimuli were played from a personal computer hard drive through a Cambridge Integrated Amplifier (model A300 or Azur 640A; Cambridge Audio, London, UK) to a Fostex full-range speaker (model FE108 Σ or FE108E Σ; Fostex Corp., Japan; frequency response range 80–18000 Hz) located inside the operant chamber beside the feeder.

Fig. 2

Acoustic operant schematic (top) with the speaker (a), motorized feeder (b), request perch (c), food cup (d), and red light (e). The thick black line represents a ventilated sound-attenuating chamber. Middle panel shows a spectrogram (y-axis = frequency, x-axis = time) of a female zebra finch distance call. Lower panel shows a male zebra finch distance call

Acoustic stimuli

Sixty distance calls of zebra finches (30 male; 30 female) were used as discriminative stimuli (Fig. 2). The calls were obtained from the datasets of D’Amelio et al. (2017) and Elie and Theunissen (2018), and from recordings made at Dalhousie University, Nova Scotia, Canada. Acoustic stimuli were broadcast at 75 dB SPL (measured at the request perch).


See Table 1 for a summary of each step of the acoustic operant experiment. First, each bird was magazine trained to use the request perch and feeder, and then pre-training started. Birds were divided into two treatment groups: (1) a 1-s group, in which food was available for 1 second following a correct response, and (2) a 2-s group, in which food was available for 2 seconds following a correct response. There were an equal number of males and females in each treatment group. There were two phases during pre-training: Tone Plus Light (TPL) and Tone No Light (TNL). During TPL, once a bird landed and remained on the request perch for at least 10 ms, a 1000-Hz tone 1 second in duration played and a red light illuminated inside the feeder. If the bird then flew into the feeder within 1 second of the tone terminating, it was rewarded with access to food (S+). If the bird left the request perch before the tone had finished playing, the chamber lights went off for 30 s as punishment. There was also an S− (unrewarded) trial during TPL: if the bird landed on the request perch for at least 10 ms and the light inside the feeder illuminated but no tone played, the correct response was to not enter the feeder. If the bird entered the feeder after a light-only trial, it was punished with the chamber lights turning off for 30 s. During TNL, half of the time when a bird landed on the request perch, only a tone was played. If the bird flew to the feeder once the tone had terminated, it received access to food. In the other half of the trials, no tone played. If the bird entered the feeder on a no-tone trial, it was punished by the chamber light turning off for 30 s. A discrimination ratio (DR; correct responses/[correct responses + incorrect responses]) was calculated for each block of 500 trials. A bird completed the TPL stage after reaching a DR of 0.8 or greater for two blocks. A bird completed the TNL stage after reaching a DR of 0.8 or greater for three blocks. The goal of these pre-training stages was to train each bird not only to remain on the perch for the entire duration of the acoustic stimulus but also to extinguish responding to the light, which would not be used in subsequent training.

Nondifferential training

Each bird was rewarded for responding to each of the 60 distance call stimuli. A bird could trigger a trial by landing and remaining on the request perch for a random interval between 900 and 1,100 ms, after which a randomly selected stimulus played. Each stimulus played once before any stimulus was played a second time. If the bird entered the feeder within 1 second after the stimulus playback was completed, it received 1 or 2 s access to food (depending on treatment group) followed by a 30-s intertrial interval. If the bird left the request perch before the stimulus playback was complete, the house light went out for 30 s (interrupted trial). If the bird failed to leave the request perch for 1 second following stimulus completion, a new trial could only be initiated after 60 s, or if the bird left and then returned to the request perch. The criterion to complete nondifferential training was six blocks, composed of 240 trials, with at least 60% responding across each stimulus and no greater than a 3% difference in response rate to future S+ (rewarded) or S− (nonrewarded) stimuli and future Probe (P+) or Probe (P−) stimuli. Once a bird completed nondifferential training, it moved on to discrimination training.

Discrimination training

Here the procedure was similar to nondifferential training with the exception that each bird heard only 40 of the 60 stimuli and was rewarded with access to food for correctly responding to S+ stimuli (female distance calls) and punished with a 30-s intertrial interval with the house light off for responding to S− stimuli (male distance calls). The criterion to complete discrimination training was six blocks (of 320 trials each) with a discrimination ratio (DR) of 0.80 or greater. The DR was calculated by dividing the average percentage of response to S+ stimuli by the sum of the average percentages of response to the S+ and S− stimuli; thus, a DR of 1.0 indicates that a bird was only responding to S+ stimuli (i.e., perfect discrimination) and a DR of 0.5 means a bird was responding equally to S+ and S− stimuli. The behavioural measure for this phase was the number of blocks to criterion (learning speed).
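As a worked illustration of the DR, the short Python sketch below assumes the denominator is the sum of the S+ and S− response percentages, which is the reading implied by the stated endpoints (1.0 for responding only to S+, 0.5 for equal responding). The function name and the handling of a block with no responses are our own assumptions, not details taken from the experimental software.

```python
def discrimination_ratio(pct_s_plus, pct_s_minus):
    """Discrimination ratio from average response percentages to S+ and S- stimuli.

    Assumption: DR = %S+ / (%S+ + %S-). Returns None when the bird responded
    to nothing, because the ratio is undefined in that case.
    """
    total = pct_s_plus + pct_s_minus
    if total == 0:
        return None
    return pct_s_plus / total


# A bird responding on 80% of S+ trials and 20% of S- trials just meets
# the 0.80 threshold used for the discrimination criterion.
dr_example = discrimination_ratio(80.0, 20.0)
meets_threshold = dr_example is not None and dr_example >= 0.80
```

Note that this ratio depends only on relative responding: a bird responding to 40% of S+ and 10% of S− stimuli obtains the same DR as one responding to 80% and 20%.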

Discrimination 85 training

This phase was similar to discrimination training, with the exception that the probability of being reinforced with food following a correct response was reduced to 0.85. The goal of this stage was to teach each bird that a correct response does not necessarily result in food. The criterion to complete this phase was six blocks (of 320 trials) with a DR of 0.8 or greater.


The probability of reinforcement to training stimuli remained the same in this phase. However, 20 more stimuli, used in nondifferential training but not during discrimination training, were now included: 10 female distance calls and 10 male distance calls called P+ and P−, respectively. These 20 stimuli were neither rewarded nor punished. There were three probe blocks. Each block consisted of 60 trials: 20 S+ and 20 S− from training, as well as the 10 P+ and 10 P−. The behavioural measure for this phase was the discrimination ratio to probe stimuli. During probe testing, each bird continued to respond to previously trained stimuli at a high level (DR to learned stimuli during the first probe, mean ± SD = 0.90 ± 0.09). The goal of the probe phase was to quantify how each bird generalized what it learned from training—that is, to what extent birds would classify non-previously trained male and female distance calls into the correct categories.

Statistical analyses

All analyses were performed using R (Version 4.1.2; R Core Team, 2018). Our primary interest was in analyzing among-individual differences in learning speed (trials/blocks to criterion) of the birds across different tasks, and so our analyses used data from select tasks (Table 1). From the foraging board experiment, we used learning speed from all four components of the experiment: shaping, length discrimination, flexibility discrimination, and colour discrimination. For the acoustic operant experiment, we used learning speed from TPL, TNL, and Discrimination. We excluded the data from the acoustic operant magazine training and 85% discrimination stage because of the low variability within each phase (i.e., most birds passed these stages in the same number of blocks; see Supplementary Information) with a few (extreme, in the case of shaping) outliers. We did not analyze the blocks to criteria from nondifferential training because birds were not learning a discrimination during this phase, rather, they were being trained to respond to all stimuli. Note that performance in the foraging board shaping, TPL, and TNL were likely especially influenced by noncognitive factors, such as neophobia, more than discrimination phases, as part of these tasks was habituating birds to the apparatus and task procedures prior to the discrimination phases; for these reasons, we have conducted some analyses with and without the foraging board shaping in particular (specified in the results section, below).

In the acoustic operant experiment, Mann–Whitney U tests showed that the 1-s and 2-s treatment groups did not significantly differ in their average learning speed on TPL (mean ± standard error blocks to criterion: 1-s group 5.00 ± 0.44; 2-s group 6.07 ± 0.85; U = 92, p = .40), TNL (1-s group 3.94 ± 0.45; 2-s group 4.00 ± 0.38; U = 94, p = .64), or Discrimination training (1-s group 11.81 ± 0.82; 2-s group 13.31 ± 1.35; U = 91, p = .58). Nonetheless, we converted all scores from TPL, TNL, and Discrimination training to z-scores within treatment groups to address the fact that each treatment group experienced a different feeder window length after a correct response to S+ exemplars.
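The within-group standardization can be illustrated with a minimal Python sketch. It is not the analysis code (the analyses were run in R); the function name and example values are hypothetical, and the sample standard deviation (n − 1 denominator) is assumed because the text does not state which estimator was used.

```python
from statistics import mean, stdev


def zscore_within_groups(scores, groups):
    """Standardize each score against the mean and SD of its own group.

    Assumption: sample SD (n - 1 denominator). Each group needs at least two
    scores and non-zero variance, otherwise the z-score is undefined.
    """
    by_group = {}
    for score, group in zip(scores, groups):
        by_group.setdefault(group, []).append(score)
    stats = {g: (mean(vals), stdev(vals)) for g, vals in by_group.items()}
    return [(s - stats[g][0]) / stats[g][1] for s, g in zip(scores, groups)]


# Hypothetical blocks-to-criterion values for three birds per treatment group.
blocks = [4, 6, 8, 10, 20, 30]
treatment = ["1s", "1s", "1s", "2s", "2s", "2s"]
z_blocks = zscore_within_groups(blocks, treatment)
```

After this step each treatment group has a mean of 0 and an SD of 1, so any overall difference between the 1-s and 2-s groups no longer contributes to among-individual variation in the pooled data.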

We first examined associations between learning speed on the seven different learning tasks mentioned above using correlation tests. The correlations for the three foraging board discriminations were previously reported in Lambert et al. (2021); we used Pearson’s r for these correlations and log-transformed the data for correlations with colour because of the positively skewed outliers in these data (see Lambert et al., 2021). Because much of the acoustic operant data were positively skewed, we used Spearman’s rank correlation (rs) for all other correlations reported here. We then examined the repeatability of individual performance on the seven learning tasks using Gaussian linear mixed-model (LMM) methods with 100 bootstraps (Nakagawa & Schielzeth, 2010); for this repeatability analysis, we also z-transformed the foraging board learning data, meaning we were analyzing the repeatability of individual z-score learning performance. We further examined repeatability within the foraging board discrimination tasks and within the acoustic operant tasks. We also examined the extent to which these seven measures of learning speed could be explained by one or multiple principal factors using principal components analysis (PCA), with the variables scaled and centered (i.e., principal components using the correlation matrix).
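For readers unfamiliar with Spearman’s rank correlation, the following self-contained Python sketch shows the computation: each variable is converted to ranks (tied values receive the average of their ranks), and Pearson’s r is then computed on the ranks. The function names are illustrative, the sketch returns only the coefficient and not a p value, and it is not the code used for the reported analyses.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_rs(x, y):
    """Spearman's rank correlation: Pearson's r computed on average ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))
```

Because only the rank order enters the calculation, a few very slow learners influence rs far less than they would influence Pearson’s r on the raw trial counts, which is why a rank-based statistic suits positively skewed data.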

We then analyzed whether learning speed in the acoustic operant task was related to generalization abilities (i.e., transfer of learning to probe stimuli). We first analyzed repeatability of individuals’ first three probe DRs using the same repeatability analysis mentioned previously. We then used a linear regression model with the DR of the first probe block as the outcome variable, and discrimination speed and treatment group (1 s or 2 s) as the predictor variables. We did not use z-transformed data for this analysis as we used treatment group (1 s or 2 s) as a predictor variable in the model, and we used only the first probe DR because subsequent probe sets may have allowed for learning about the probe stimuli and the DR was significantly repeatable across probes (i.e., individuals performed similarly across the first three probe sets; see Results).

We used α = 0.05 for all tests, and all means presented are means ± SD unless indicated as SE (primarily presented for test results); we did not apply corrections for multiple comparisons because such corrections are overly conservative for animal behaviour research (Moran, 2003; Nakagawa, 2004). The raw data (Supplementary Resource 1) and R code (Supplementary Resource 2) for our experiments, as well as a video showing passes for each of the discrimination tasks (Supplementary Resource 3), are included as supplementary information.


Thirty-two of 34 birds completed each of the foraging board tasks. Two females failed the final step of shaping after 78 trials and so did not proceed in the experiment. The 32 birds (16 female; 16 male) that successfully passed shaping did so in an average of 30.66 ± 13.03 trials and subsequently completed all three foraging board discrimination tasks and were then used in the acoustic operant experiment. The birds passed the length discrimination in 49.81 ± 22.2 trials, flexibility discrimination in 40.5 ± 16.42 trials, and colour discrimination in 8.62 ± 3.21 trials. Note that three birds (one male; two females) did not reach criterion in the length discrimination and were assigned a maximum score of 98 trials (birds were moved on from length discrimination if not passing by this point).

Twenty-nine of 32 birds completed the acoustic operant experiment. One bird developed an unidentified health issue and died during the experiment. Six birds did not pass magazine training initially and so were restarted from the first step of magazine training again, and four of these birds then successfully completed the acoustic operant experiment. The 29 birds took 5.17 ± 1.83 blocks to pass the TPL; 3.97 ± 1.59 blocks to pass TNL; and 12.48 ± 4.06 blocks to pass Discrimination. For the probe trials, birds had an average DR of 0.57 ± 0.27 for the first probe, 0.53 ± 0.30 for the second probe, and 0.42 ± 0.25 for the third probe; note, however, that birds only responded to an average of 5.05 ± 1.71 of the 20 probe stimuli within any probe session (and only 10 of the 20 stimuli belonged to the S+ category).

Relationships between the learning tasks

The correlation matrix showing the r/rs values and accompanying p-values and confidence intervals is in Table 2. The correlations among the three foraging board discrimination tasks were previously reported in Lambert et al. (2021), except for the correlations involving shaping. In brief, all three foraging board discrimination tasks were positively correlated, but only the correlation between length and flexibility was statistically significant. Among our new correlations with shaping, the only significant one was a negative correlation between shaping and length discrimination (Fig. 3). For the acoustic operant tasks, all three tasks were positively correlated with one another, but the only significant correlation was between TPL and auditory discrimination. Correlations across the two experiments were largely positive, but the only significant relationship was between flexibility discrimination and auditory discrimination (Fig. 3).

Table 2 Correlation matrix for the seven learning tasks across two experiments, with p values in parentheses and confidence intervals in brackets. Note. Pearson’s r was used for correlations between length, flexibility, and colour (denoted with *), while Spearman’s rank correlation (rs) was used for all other correlations. n = 32 (df = 30) for all correlations within the foraging board tasks (Shaping, Length, Flexibility, Colour; marked with f in the table). For all other correlations, n = 29 (df = 27). Acoustic operant tasks (marked with o in the table) are z-transformed by treatment group (1 s or 2 s).
Fig. 3

Scatterplots displaying the significant correlations between learning tasks. Each point represents an individual, and the axes of each plot display the trials/blocks to criterion for a given task. Correlations and significance were tested using Spearman’s rank correlation (rs), with a negative correlation between length discrimination and shaping (rs = −0.35, p = .0496), a positive correlation between vocal and flexibility discrimination (rs = 0.42, p = .02), and a positive correlation between TPL and vocal discrimination (rs = 0.49, p < .01). Smaller numbers mean the task was learned faster.

Repeatability analysis of the z-scores of the seven learning tasks found significant individual repeatability in performance (R = .22 ± 0.07 [SE]; CI [0.06, 0.35]; p < .01; Fig. S1), suggesting that individuals’ learning speeds relative to each other were consistent across the tasks. Because shaping was distinct from the other six tasks in that it did not involve discrimination, and because it was negatively correlated with length, we examined repeatability excluding shaping and found a similar result (R = .27 ± 0.08 [SE]; CI [0.11, 0.42]; p < .01); repeatability was higher when examining only the three foraging board discrimination tasks (R = .37 ± 0.11 [SE]; CI [0.14, 0.54]; p < .01) or only the three acoustic operant tasks (R = .42 ± 0.13 [SE]; CI [0.16, 0.62]; p < .01).
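As an illustration of what a repeatability value R expresses, the sketch below z-scores each task and then estimates R as the share of total variance attributable to differences among individuals. Note that it substitutes the simpler one-way ANOVA estimator for balanced data in place of the LMM-based method with bootstrapped intervals used in the study, so it yields no standard error or confidence interval; the function names and the four-bird, three-task data set are hypothetical.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize one task's scores to mean 0 and (sample) SD 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def anova_repeatability(scores):
    """ANOVA-based repeatability; scores[i][j] = individual i on task j.

    Assumes a balanced design: every individual has a score on every task.
    """
    k, n = len(scores), len(scores[0])
    grand = mean(x for row in scores for x in row)
    ms_among = n * sum((mean(row) - grand) ** 2 for row in scores) / (k - 1)
    ms_within = sum(
        (x - mean(row)) ** 2 for row in scores for x in row
    ) / (k * (n - 1))
    var_among = (ms_among - ms_within) / n  # among-individual variance
    return var_among / (var_among + ms_within)

# Hypothetical trials to criterion: one row per task, one column per bird.
raw_by_task = [
    [30, 45, 60, 98],
    [25, 40, 38, 70],
    [6, 9, 8, 12],
]
z_by_task = [z_scores(task) for task in raw_by_task]
z_by_bird = [list(bird) for bird in zip(*z_by_task)]  # transpose to birds x tasks
R = anova_repeatability(z_by_bird)
```

Standardizing within each task first puts tasks with very different raw scales (tens of trials versus a handful of blocks) on a common footing, so R reflects each bird's relative standing and not the task's units.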

This significant repeatability was further supported by our PCA—the first principal component accounted for 36% of the variance across the seven learning tasks, and all tasks loaded negatively onto this component—meaning that an increase on this component was associated with an increase in performance (lower/faster learning speed) across all of the seven tasks (Table 3). There were two other significant (eigenvalues >1) principal components, but there was no clear pattern to the loadings of these factors except perhaps that the three foraging board discriminations all loaded similarly on the second principal component, indicative of the positive correlations between these three tasks.
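The link between the first component's eigenvalue and its share of variance can be sketched as follows. For a PCA on the correlation matrix, the total variance equals the number of variables, so a first component explaining about 36% of the variance across seven tasks corresponds to an eigenvalue of roughly 0.36 × 7 ≈ 2.5. The sketch uses power iteration in place of a full eigendecomposition, and the three-task correlation matrix is hypothetical (it is not Table 2).

```python
import math

def first_component(corr, iterations=500):
    """Leading eigenvalue and unit eigenvector of a correlation matrix.

    Uses power iteration, which converges when the largest eigenvalue is
    distinct and the start vector is not orthogonal to its eigenvector.
    """
    p = len(corr)
    v = [1.0 + 0.1 * i for i in range(p)]  # arbitrary non-zero start
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v' C v gives the eigenvalue for unit-length v.
    eigenvalue = sum(v[i] * corr[i][j] * v[j] for i in range(p) for j in range(p))
    return eigenvalue, v

# Hypothetical correlation matrix for three tasks.
corr = [
    [1.0, 0.5, 0.3],
    [0.5, 1.0, 0.4],
    [0.3, 0.4, 1.0],
]
eigenvalue, direction = first_component(corr)
share = eigenvalue / len(corr)  # total variance of standardized variables = p
```

The overall sign of a principal component is arbitrary, so what carries meaning is that all tasks load in the same direction, not whether that direction is labelled negative or positive.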

Table 3 Results from the principal components analysis of the seven learning tasks, presenting only those components with eigenvalues >1.

Learning speed and probe performance

We found evidence that individual performance on the probe trials was repeatable, with R = .25 ± 0.13 (SE); CI [0.02, 0.48]; p = .03. Discrimination ratio of the first probe was not predicted by the learning speed of the acoustic discrimination (effect ± SE: 0.006 ± 0.01; t25 = 0.01, p = .63; Fig. 4; note that one bird did not respond to any probe stimuli and so n = 28 for the first probe), and the 1-s and 2-s groups did not differ in their DR performance (effect ± SE: −0.17 ± 0.10; t25 = −1.65; p = .11; see Fig. 4).

Fig. 4

Performance in the first probe trial compared to vocal discrimination learning speed, indicating no correlation between the two (p = .63). Vocal discrimination learning speed represents the blocks to criterion, while probe discrimination ratio represents the discrimination ratio (go responses to P+ [novel female calls], divided by all go responses) during the first probe session; a discrimination ratio > 0.5 indicates birds classified probe stimuli as belonging to the correct category (male or female distance call) more often than not. Each dot represents one individual, with the exception of two points representing birds with identical scores at (11, 0.67) and (19, 0.5); n = 28 birds
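The discrimination ratio described in this caption can be written as a short function. This is a sketch under the definition given here (go responses to P+ divided by all go responses); the function and argument names are hypothetical, and returning `None` for a bird with no go responses is an assumption that mirrors the exclusion of the one non-responding bird.

```python
def discrimination_ratio(go_to_rewarded, go_to_unrewarded):
    """Go responses to rewarded-category probes divided by all go responses.

    Returns None when the bird made no go responses, because the ratio is
    undefined in that case.
    """
    total_go = go_to_rewarded + go_to_unrewarded
    if total_go == 0:
        return None
    return go_to_rewarded / total_go

# Hypothetical bird: 4 go responses to novel female calls, 2 to male calls.
dr = discrimination_ratio(4, 2)
```

Because birds responded to only about five probe stimuli per session on average, each ratio rests on a small number of responses and can take only a few distinct values.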


We found that birds that learn physical cognition tasks more quickly also learn auditory discrimination tasks more quickly. We measured learning speed in the same male and female zebra finches across seven different tasks. Using a foraging board in the animals’ home cage, we quantified trials to criterion in (1) shape training, (2) a length discrimination, (3) a flexibility discrimination, and (4) a colour discrimination. Using a free-operant procedure in which the birds lived and worked in an acoustic operant chamber, we quantified blocks to criterion in (1) tone-plus-light training, (2) tone-no-light training, and (3) acoustic discrimination between male and female vocalizations. We found some correlations in learning speed and significant repeatability calculated across all seven tasks, suggesting a potential general learning mechanism across the different tasks that could be considered a cognitive trait. The results of our PCA further suggested some common cognitive mechanism involved across the different learning tasks, as all seven tasks loaded unidirectionally onto a first component that accounted for 36% of the variation. Our findings show that learning speed is repeatable across behavioural testing contexts that measure learning in different sensory-cognitive domains. Specifically, the foraging board involved visual and sensorimotor information potentially useful in nest building, and the acoustic operant experiment involved acoustic information with potential social relevance. Repeatability was higher within each of the two different types of tasks (foraging board and acoustic operant) than across all tasks, providing some, albeit limited, evidence of distinct domains or learning mechanisms in these different tasks. Furthermore, we did not find evidence for learning speed affecting generalization of learned stimuli (i.e., a speed–accuracy trade-off sensu Sih & Del Giudice, 2012).

Evidence for domain-general learning

Both our repeatability analysis and PCA provide evidence that individual birds’ learning performance translates across tasks, potentially suggesting the different learning tasks involve some common cognitive mechanism(s) and/or domain-general learning ability. We found significant repeatability of performance both within each experiment and across all seven learning tasks. Repeatability within foraging board tasks and acoustic operant tasks was moderate, providing evidence that birds’ learning speed was moderately repeatable even with different types of discriminations within each experiment (i.e., structural and visual within the foraging board experiment, and acoustic and visual within the acoustic operant experiment). Kriengwatana et al. (2016) found similar results in acoustic discriminations, namely a strong significant correlation between performance on two different auditory discrimination tasks. Furthermore, birds’ learning speed was repeatable across all seven tasks, suggesting that birds that performed above/below average on one task were more likely to perform above/below average on even very different tasks. Our repeatability value across seven different tasks is interesting given that a meta-analysis of individual repeatability on the exact same task found a range of R = .15–0.28 (Cauchoix et al., 2018). This range is very similar to our own findings, even though our findings include different tasks whereas the meta-analysis focused on repeatability within the same tasks.

Our PCA findings further align with our repeatability measure in suggesting some evidence of our tasks requiring a common mechanism and/or domain-general learning. As mentioned in the introduction, other studies have found variable evidence for g in animals (Burkart & van Schaik, 2016) and the only meta-analysis on the topic found a median of 32% with a range of 17%–64% for variance explained by the first principal component (Poirier et al., 2020), very similar to the 36% of the variance explained by PC1 in our study. What does this mean? This could be evidence of g and indeed lines up with how g is considered and defined in other research, and so this finding may provide evidence of some cognitive mechanism (or groupings of mechanisms) that might be called a domain-general cognitive ability. However, there are additional factors to consider. First, g is thought to span many types of sensory and cognitive domains, including spatial learning and inhibitory control. Our tasks primarily involved visual, structural, and auditory information, across two different contexts, limiting our ability to generalize to other sensory/cognitive domains. In light of the similarities between our different tasks (six of the tasks involved discriminating between different types of stimuli), our findings may indicate not some domain-general intelligence but rather some very specific cognitive trait that is engaged by all of these tasks (Shuker et al., 2017). Additionally, because all of the tasks involved food reinforcement, it may be that the reward system of the brain was the (or one of the) common mechanism(s) engaged by each of these tasks, such that repeatable learning performance across the tasks might be explained by some aspect of an individual’s reward system (Arias-Carrión et al., 2010). Whether the reward system can be thought of as domain-general, domain-specific, or perhaps even a noncognitive factor such as motivation (see below) is uncertain, though it has been argued that reward-seeking behaviour was key in the evolution of domain-general cognitive mechanisms (Chiappe & MacDonald, 2005) and that the expansion and integration of the reward system was key in human cognitive evolution (Previc, 1999).

Alternatively, similar performance across tasks may represent a noncognitive trait that similarly affects performance on different tasks, such as motivation (Macphail, 1985; Völter et al., 2018). Although our methods went to great lengths to control for motivation by following standardized food deprivation procedures and ensuring that birds reached consistent levels of responding prior to undertaking discrimination training (although we include foraging board shaping in our analyses), separating motivation from learning ability is still difficult (Rowe & Healy, 2014), and motivation can have demonstrable effects on performance in different tasks (Cooke et al., 2021).

Evidence for domain-specific learning

Though we did find evidence for repeatable learning speed across different tasks, this repeatability was low to moderate, and our findings still leave open the possibility of other variables or potentially distinct cognitive domains affecting performance across different tasks. Repeatability within only the foraging board discrimination tasks or only the acoustic operant tasks was higher than repeatability across all tasks spanning both domains, which suggests that the discriminations in the two experiments involved different cognitive mechanisms, though it can be difficult to separate differing mechanisms from differing contexts. The correlations between the different learning tasks also provide some evidence of different mechanisms involved. In particular, length and flexibility discrimination (the only two tasks designed to test physical cognition, i.e., discrimination based on structural properties that are relevant in nest-building contexts) were significantly positively correlated, while colour discrimination, the other discrimination task conducted using the foraging board, was not significantly correlated with either of the physical cognition tasks measured using the same apparatus, though this could be because of the strong floor effects in colour discrimination. This provides some (weak) evidence that these physical cognition tasks relied upon mechanisms distinct from those used in colour discrimination. However, auditory discrimination was positively correlated with both flexibility discrimination and TPL, which is more in line with the evidence that birds’ learning abilities translate across different contexts and domains, as these three tasks involve visual and sensorimotor information (the flexibility discrimination) and auditory information (TPL and the auditory discrimination).

Shape training on the foraging board was negatively correlated with length discrimination, indicating that birds that took a long time on shaping learned the length discrimination more quickly than birds that were fast to learn the shape training. What this means is unclear. Since birds proceeded to length discrimination directly after shaping, it is possible that birds that quickly passed shaping had not learned the affordances of the task as well and so took longer to learn the length discrimination. Alternatively, it is also possible that some third variable explains this relationship; for example, birds that pass shaping quickly may be more active or bold, and higher activity/boldness could be negatively correlated with something like inhibitory control (Dougherty & Guillette, 2018), as a key part of all of these discriminations is withholding responses to the S− (errors). However, shaping was negatively correlated only with length, and so learning the length discrimination would have to somehow dampen these effects in subsequent discriminations. It should also be noted that shaping is the task most distinct from the other six learning tasks, as it does not involve discrimination of any sort.

Lack of evidence for learning-speed–learning-generalization trade-off

We did not find evidence that auditory discrimination learning speed predicted the ability to transfer or generalize learning to new stimuli. Most birds did not seem to generalize their learning to the probe stimuli, as the average DR was near 50% for all probes. Some hypotheses suggest that fast learners might be less accurate and potentially less flexible, and therefore less adept at transferring or generalizing what they have learned, while slower learners might better generalize their learning (Sih & Del Giudice, 2012)—our results do not support this speed–accuracy trade-off as learning speed did not predict transfer of learning to new stimuli. However, since so few birds generalized successfully at all it is hard to say anything conclusively other than that the birds did not seem to learn the “rule” that female distance calls were rewarded but potentially learned and memorized individual calls (Yu et al., 2020) that were rewarded or unrewarded.


In conclusion, our study found some correlations between individual performance on different tasks and weak-to-moderate but significant repeatability across our seven learning tasks. Our findings suggest that either (a) zebra finches possess some domain-general learning ability that translates across a variety of different tasks and may be considered a cognitive trait, (b) all of our tasks involved some common cognitive trait that similarly influenced performance across each of them but is not necessarily domain-general and is instead specific to the learning tasks of our experiment, or (c) some other noncognitive factor explains the individual repeatability in performance. Our study represents another step towards identifying the cognitive constructs/domains we are actually analyzing and determining what cognitive mechanisms truly differ consistently between individuals.