Introduction

A growing body of evidence suggests that several corvids possess highly developed cognitive skills. Much research has illuminated these skills through investigations into caching behaviour. For example, Western scrub-jays (Aphelocoma californica) can remember the what-where-and-when of specific past events (Clayton and Dickinson 1998), plan for future needs (Correia et al. 2007; Raby et al. 2007) and use personal experience to predict the behaviour of others (Emery and Clayton 2001), and both scrub-jays and ravens (Corvus corax) use information about others’ knowledge state to inform their cache protection strategies (Bugnyar and Heinrich 2005; Dally et al. 2004, 2005, 2006; Emery et al. 2004). In addition to these studies on the cognitive control of food-caching, magpies (Pica pica) have been shown to pass the mirror mark test of self-recognition (Prior et al. 2008), and rooks (Corvus frugilegus) have been observed displaying social support to individuals that have just been in a fight (Seed et al. 2007), and spontaneously complete a task requiring cooperation (Seed et al. 2008), suggesting high levels of empathy and social cognition in a variety of scenarios.

Corvids appear to particularly excel in physical cognition. Lefebvre et al. (2002) report instances of tool-use in 24 corvid species, and 14 of the 39 avian species exhibiting ‘true tool-use’ are corvids (although these reports are mostly anecdotal). The most studied example of tool-use in corvids is that of the New Caledonian crows (Corvus moneduloides), which habitually use and manufacture tools in the wild (e.g. Hunt et al. 2002; Holzhaider et al. 2008). In laboratory studies, these crows have demonstrated a good understanding of the physical properties underlying their tool-use (Chappell and Kacelnik 2002, 2004; Taylor et al. 2009), meta, or sequential, tool-use (the use of one tool to retrieve another; Taylor et al. 2007; Wimpenny et al. 2009) and tool-manufacture from unfamiliar materials (Weir et al. 2002). Interestingly, rooks, which are not natural tool-users, have also passed these tests of understanding (Seed et al. 2006; Tebbich et al. 2007) and have recently been found capable not only of tool-use, but also of choosing appropriate tools for different tasks, creative tool-manufacture and sequential tool-use (Bird and Emery 2009a).

The fact that rooks appear to rival habitual tool-users in their cognitive understanding of physical problems suggests that complex physical cognition may not be an adaptation for tool-use, but that tool-use only emerges in species whose ecology demands it. The New Caledonian crow lives in an environment in which the absence of woodpeckers leaves an available ecological niche, namely the extraction of highly desirable protein-rich grubs from narrow crevices (Hunt et al. 2002), whereas rooks live in an environment where nutritious food, such as carrion, is more readily available (e.g. Lockie 1956) and extractive foragers such as woodpeckers are plentiful. Rooks also possess beaks adapted to digging in earth for insects. It therefore seems likely that physical cognition, having evolved for some purpose other than tool-use, is expressed differently in two closely related species depending upon their habitat. Sophisticated physical cognition might thus be found in other non-tool-using members of the corvid family, outside of the Corvus genus to which both New Caledonian crows and rooks belong.

The present study follows from research conducted by Bird and Emery (2009b), inspired by Aesop’s fable, “The Crow and The Pitcher”. In this tale, a thirsty crow used stones to raise the level of water so that it could reach it to drink. To test whether a corvid would be capable of such a feat, Bird and Emery presented rooks with a number of stones and a clear plastic tube containing water and a floating food item. The birds readily used the stones to raise the level. Subjects were also found to learn to use a few large stones rather than several small ones and drop the stones in water rather than sawdust.

Here, we first investigate whether a fellow corvid, the Eurasian jay (Garrulus glandarius), can solve such a task. This is a question of evolutionary importance: if the jays were found to rival rooks in their performance, this finding would suggest that physical cognition is not limited to the Corvus genus, to which the rooks and New Caledonian crows belong, but is present in at least one other genus of Corvidae, Garrulus, and, given that these genera are only distantly related, may suggest a wider distribution among the corvid family.

The second objective is to investigate the psychological mechanisms that underlie this performance. This issue will be explored in a series of six experiments, each providing the subject with different cues. Experiments 1 and 2 investigate whether the subjects are able to learn the conditions necessary for a displacement event. In Experiment 1, subjects choose between inserting stones into a tube of liquid substrate (water) or a tube containing either a solid substrate (woodchips) or no substrate at all (both are baited with a worm). In Experiment 2, the subjects must choose between inserting an item that sinks or an item that floats into a tube of water baited with a worm. Hence, in these two experiments, the choice is between a tool or apparatus that is functional according to physical principles and one that is not. In Experiment 3, the birds choose between inserting stones into a tube filled with red woodchips and a tube filled with blue woodchips, both of which are baited. For one of these tubes, the correct number of insertions results in a reward, whereas in the other no food reward is ever given. This task replicates the reward schedule of the previous experiments, but is stripped of causal information and movement cues. Experiment 4 consists of a choice between two tubes of water, one of which is baited with a worm and one of which is not. Here, the causal and movement cues are the same as in Experiments 1 and 2, and the reward schedule is comparable, but only action into the baited tube can be considered “goal-directed”. Experiment 5 presents the subjects with a choice between two apparatuses that mimic the movement cues of water, but are not reliant on mechanical causation; for each stone insertion into the “correct” apparatus, a hidden experimenter moves the worm closer to the subject, while an insertion into the incorrect apparatus causes no movement. 
This experiment replicates the movement cues and reward schedule of Experiments 1 and 2 but offers the subjects no causal cues by which to infer how the apparatus “works”. Finally, Experiment 6 presents the subjects with an apparatus in which insertion of a stone into the “correct” tube of water causes the water level to rise in the adjacent (baited) tube. This experiment provides the subjects with the same reward schedule and movement cues as Experiments 1, 2 and 5, but functions in a way that is designed to violate any causal expectations that may exist (we shall, for the sake of brevity, refer to this as presenting “counter-intuitive” causal cues hereafter).

By presenting subjects with a series of experiments that manipulate the information that is available to facilitate learning, we can establish, from the pattern of performance, what information is necessary for the birds to learn. From this, we may be able to infer the mechanisms by which such learning occurs. For example, if subjects simply perform any action within their repertoire towards perceptually available food items, they might be expected to only solve Experiment 4, but none of the other experiments in this series. At the other extreme, if the birds had a full appreciation of physical causation and were using this and only this to guide their behaviour, then one might expect the jays to struggle on tasks that involve no such causation or counter-intuitive causal cues (e.g. Experiments 3, 5 and 6) but perform well on the others (Experiments 1, 2 and 4). Figure 1 summarises the perspectives from which the subjects may approach the task and the pattern of task performance that might be predicted for each one of them.

Fig. 1 Models of task performance given different learning heuristics. Boxes marked in black indicate tasks that would be predicted not to be passed or to take significantly longer to pass than boxes marked in white

General methods

Subjects

Five hand-raised sub-adult Eurasian jays were tested: Romero, Hoy, Ainsley, Wiggins and Hunter (sexes unknown). All of the birds were between 1 and 2 years old at the time of testing and pair-housed in a climate-controlled room in cages measuring 2 m × 1 m × 1 m, in the Sub-department of Animal Behaviour at the University of Cambridge. The birds were maintained on a diet of cat food, egg, vegetables, nuts, seeds and fruit and kept on a 12:12-h light/dark schedule.

Experimental conditions

There were considerable individual differences between subjects in terms of their motivation to achieve a food reward and their degree of neophobia towards the apparatuses. Consequently, there were differences in the experimental conditions. Ainsley and Wiggins were cage-mates who tended to displace each other from the apparatuses but would work well in isolation. However, they were highly neophobic of the apparatuses and would only interact with them when hungry. As a result, these birds were tested in isolation after 2 h of food deprivation. By contrast, Romero and Hoy were very motivated and inquisitive and did not need to be hungry to interact with an apparatus, but would not work when isolated. As such, they were tested with their cage-mate still present (cage-mates rarely approached the apparatus and did not seem to affect performance beyond facilitating approach by the experimental bird) and without food deprivation. Hunter was Hoy’s cage-mate and was prevented by the more dominant bird from approaching the apparatus, and thus could only be tested in isolation. Hunter was not food deprived for Experiment 6, but when willingness to approach the apparatuses decreased in the experiments that followed, 2 h of food deprivation was introduced. All birds had access to water ad libitum throughout the experiments.

Analysis

Trials were recorded using Geovision GV-1480 CCTV © 2006 and coded for stone and item insertions. A second coder coded 15% of trials for each experiment, chosen at random. Inter-observer reliability was calculated using linearly weighted Cohen’s Kappa at http://faculty.vassar.edu/lowry/kappa.html. Performance in each experiment was assessed using binomial tests at http://faculty.vassar.edu/lowry/binomial.html. These tests were performed at the level of the trial, using the number of correct stone insertions the bird made in a given trial as a proportion of the total number of stone insertions made in that trial.
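The binomial analysis described above can be sketched in a few lines of Python. This is a minimal illustration rather than the online calculator used in the study: it assumes a two-sided exact test against a chance level of 0.5, and it assumes that per-trial counts of correct and total insertions are pooled before testing. The function names and the example counts are hypothetical.

```python
from math import comb


def pool_trials(trials):
    """Sum (correct, total) insertion counts over a list of trials."""
    correct = sum(c for c, _ in trials)
    total = sum(t for _, t in trials)
    return correct, total


def binomial_two_sided_p(successes, n, p0=0.5):
    """Exact two-sided binomial P value.

    Sums the probabilities of all outcomes that are no more likely than
    the observed one under the null proportion p0 (assumed here to be 0.5).
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    pmf = [comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(n + 1)]
    observed = pmf[successes]
    # Small relative tolerance so that tied probabilities are both counted.
    p_value = sum(p for p in pmf if p <= observed * (1 + 1e-9))
    return min(1.0, p_value)


# Hypothetical data: (correct insertions, total insertions) for three trials.
example_trials = [(3, 4), (4, 4), (2, 3)]
correct, total = pool_trials(example_trials)
print(correct, total, round(binomial_two_sided_p(correct, total), 3))
```

A one-tailed variant, or a test applied to each trial separately, would need only a change to which outcomes are summed or to how the counts are pooled.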

Pre-training

Methods

Apparatus

All of the subjects were trained on the apparatus used by Bird and Emery (2009a; Fig. 2) in which a stone dropped down a Perspex tube would dislodge a platform, releasing a worm. During this training, they learned to drop stones down tubes, but not in the context of water.

Fig. 2 Illustration of the sets of apparatus used in each experiment

Procedure

There were three levels of training. First, a stone was presented on a platform above the tube, such that it was possible to accidentally knock it in. Second, the stone was placed on the floor such that subjects had either to place it on the platform themselves, and then knock it in, or to pick it up and drop it directly into the tube. Finally, the stone was placed on the floor with no platform attached to the tube. The subjects then had to pick up the stone from the floor and drop it directly into the tube. Subjects were progressed to the water-level task if they reached criterion on this training task. Criterion consisted of five consecutive stone insertions on the third training level. This criterion was relaxed for Hunter who, despite not reaching criterion on the training task, spontaneously took part in Experiment 6.
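The training criterion can be expressed as a simple run-length check. The sketch below assumes that each interaction with the apparatus is recorded as either a successful stone insertion or not, and that any unsuccessful interaction resets the run; both the recording scheme and the function name are illustrative assumptions rather than details taken from the study.

```python
def interactions_to_criterion(outcomes, run_length=5):
    """Return the interaction count at which criterion is first met.

    outcomes: sequence of booleans, True for a successful stone insertion.
    Criterion is run_length consecutive successes (five in the text).
    Returns None if criterion is never reached.
    """
    streak = 0
    for count, inserted in enumerate(outcomes, start=1):
        # Assumption: any non-insertion interaction resets the streak.
        streak = streak + 1 if inserted else 0
        if streak == run_length:
            return count
    return None


# Hypothetical record: two insertions, one failure, then five in a row.
record = [True, True, False, True, True, True, True, True]
print(interactions_to_criterion(record))  # 8
```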

Results and discussion

The five birds showed considerable individual differences in their ability to perform with the training apparatus. Table 1 shows the number of times each subject interacted with the apparatus (that is to say, picked up the stone, pecked the apparatus etc.) before reaching criterion. Ainsley and Wiggins were substantially slower at reaching criterion than Hoy and Romero, and Hunter never reached criterion. These results may be partly explained by the reduced willingness of Ainsley, Wiggins and Hunter to approach the training apparatus, such that even when they were able to successfully use the apparatus, achieving 5 consecutive stone insertions took time. Given the individual differences in the training procedure, one might expect to see a similar pattern of performance in the main tasks, with birds that performed better and more readily on the training task also being more successful on the main tasks.

Table 1 Number of interactions each bird had with the training apparatus before reaching criterion

Experiment 1

Introduction

Because the birds had been trained to insert stones into a tube, it was important to investigate not whether they were capable of transferring this insertion behaviour to a water-filled tube (as this could be simple generalisation) but whether they would choose correctly between a water-filled, and therefore functional, tube and a tube that was filled with another substance and therefore non-functional. Thus, we primarily investigated preference (in terms of the number of stones dropped) for the functional tube rather than willingness to insert stones into a tube of water, or success at actually retrieving the food reward from the water. To control for the possibility of stimulus generalisation, two versions of this experiment were presented: one in which the non-functional tube was very visually different from both the water tube and the training apparatus (Experiment 1a; water vs. woodchip) and one in which the non-functional tube was more visually similar to the training apparatus than the water tube (Experiment 1b; water vs. empty).

Methods

Subjects

All birds took part in Experiment 1b. All birds except Hunter took part in Experiment 1a.

Apparatus

Two Perspex tubes of 15 cm height and either 4.5 or 3 cm inner diameter, depending on the size of the bird, were used (Hoy and Romero received the 4.5 cm tubes; Ainsley, Wiggins and Hunter received the 3 cm tubes).

Tool items were large stones (for the 4.5 cm tube; mean weight: 16.6 g) or small stones (for the 3 cm tube; mean weight: 8.5 g); each stone would raise the water level by 4 mm. The level of the water or other substrate was adjusted for each subject with relation to the height at which they could reach the worm without using any stones (i.e. the reachable height). The subjects differed in their physical stature and consequently how far they could reach down into the tube; thus, each bird had a different reachable height. This was determined by repeatedly presenting the birds with a baited water tube at successively adjusted water levels. No stones were presented to the birds during this measure, and no determination required more than 6 iterations.

Procedure

In Experiment 1a, two tubes were presented, one filled with water (the ‘functional tube’) and the other filled to the same level with fine woodchip (the ‘non-functional tube’). Both tubes were baited with a wax-worm (wax-moth larva, Galleria mellonella). Although semi-buoyant, the wax-worms were tied to a small cork float to reduce the risk of sinking. Ten stones were placed equidistantly between the tubes. For each subject, the level of the substrate in each tube was lowered by a set amount from the reachable height such that each subject needed to use the same number of stones to be successful.

Trials ended if subjects successfully retrieved the worm or were unsuccessful after 10 min. Trials in which the subject did not approach the apparatus were re-tested. Five consecutive non-approach trials ended testing and the subject’s data were discarded. Each subject was given 15 trials. The number of stones needed to gain a reward was varied between high, medium and low (needing approximately 6, 4 and 2 stones, respectively) pseudo-randomly across the 15 trials and the starting level of the water and substrate varied accordingly.
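The relationship between reachable height, starting level and the number of stones required follows from the 4 mm rise per stone given in the Apparatus section. The sketch below works through that arithmetic; the reachable height of 100 mm is a hypothetical value, the function names are illustrative, and the stone counts of 6, 4 and 2 are the approximate values stated above.

```python
from math import ceil

RISE_PER_STONE_MM = 4.0  # rise in water level per stone, as stated in the text
STONES_BY_CONDITION = {"high": 6, "medium": 4, "low": 2}  # approximate values


def starting_level_mm(reachable_height_mm, stones_needed, rise_mm=RISE_PER_STONE_MM):
    """Level to which the substrate is lowered so that stones_needed
    insertions bring it back up to the bird's reachable height."""
    return reachable_height_mm - stones_needed * rise_mm


def stones_required(reachable_height_mm, current_level_mm, rise_mm=RISE_PER_STONE_MM):
    """Smallest number of stones that lifts the level to the reachable height."""
    gap = reachable_height_mm - current_level_mm
    return max(0, ceil(gap / rise_mm))


# Hypothetical bird whose reachable height is 100 mm above the tube base.
reachable = 100.0
for condition, stones in STONES_BY_CONDITION.items():
    level = starting_level_mm(reachable, stones)
    print(condition, level, stones_required(reachable, level))
```

Because the starting level is defined relative to each bird's own reachable height, birds of different stature face the same number of required insertions in a given condition.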

Experiment 1b followed the same design except that the ‘non-functional tube’ was in this case empty, and the worm was stuck (with Blu-tac©) to the side of the tube at the same level as the water in the functional tube.

Results and discussion

There was high concordance between observers’ scoring of stone insertions, as reflected by a Cohen’s kappa score of k = 0.87 for Experiment 1 overall.
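For readers wishing to reproduce a reliability statistic of this kind, linearly weighted Cohen’s kappa can be computed as sketched below. This is an illustrative implementation of the standard formula, not the online calculator used in the study; it assumes that each coder’s scores fall into an ordered set of categories (for example, the number of insertions counted in a trial), and the function name and example scores are hypothetical.

```python
from collections import Counter


def linear_weighted_kappa(rater_a, rater_b, categories):
    """Linearly weighted Cohen's kappa for two raters.

    categories: the ordered list of possible scores. Agreement weights
    fall off linearly with the distance between the two assigned scores.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must score the same non-empty set of items")
    k = len(categories)
    if k < 2:
        raise ValueError("need at least two categories")
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)

    joint = Counter((index[a], index[b]) for a, b in zip(rater_a, rater_b))
    row = Counter(index[a] for a in rater_a)
    col = Counter(index[b] for b in rater_b)

    def weight(i, j):
        return 1 - abs(i - j) / (k - 1)

    # Weighted observed agreement and weighted chance-expected agreement.
    observed = sum(weight(i, j) * c / n for (i, j), c in joint.items())
    expected = sum(
        weight(i, j) * row[i] * col[j] / n**2 for i in range(k) for j in range(k)
    )
    if expected == 1:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical insertion counts scored by two coders for six trials.
coder_1 = [0, 2, 4, 4, 6, 3]
coder_2 = [0, 2, 4, 5, 6, 3]
print(round(linear_weighted_kappa(coder_1, coder_2, list(range(7))), 2))
```

With only two categories the linear weights reduce to exact-match scoring, so the function then returns ordinary unweighted kappa.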

Experiment 1a

Both Hoy and Romero learned a significant preference for the water tube by the end of 15 trials (binomial test, P = 0.03 and P = 0.04, respectively); Wiggins did not demonstrate a preference (binomial test, P = 0.187) and Ainsley demonstrated a preference for the woodchip tube (see Footnote 1; binomial test, P = 0.043).

Experiment 1b

Hunter’s and Ainsley’s data were excluded from Experiment 1b due to unwillingness to approach. Hoy learned a significant preference for the water tube by the end of 15 trials (binomial test, P = 0.02), but Romero did not (binomial test, P = 0.18; although he did, numerically, put more stones into the water tube). Wiggins did not insert a sufficient number of stones to assess preferences.

When the results of Experiments 1a and 1b are aggregated, both Hoy and Romero showed a significant preference for the water tube (binomial test, P = 0.002 and P = 0.009, respectively) but Wiggins did not (binomial test, P = 1). Figure 3 shows the trial-by-trial performance for each of the birds.

Fig. 3 Trial-by-trial performance in Experiment 1. Insertion order runs from top to bottom, and trial order runs from left to right

These data indicate that two of the birds learned a preference for dropping stones into a tube containing water, rather than a tube containing woodchips or an empty tube. This suggests that they can come to appreciate one of the necessary conditions for success in a displacement task: that items should be inserted into a liquid. Experiment 2 assessed their appreciation of the second necessary condition: the insertion of a sinkable object.

Experiment 2

Introduction

Displacement does not simply occur if you insert any object into liquid. The choice of object for insertion is just as important as the choice of substrate in which to insert. Here, we investigated whether the two birds that had shown a preference for a functional substrate would also show a preference for a functional object.

Methods

Subjects

Only Hoy and Romero took part in this experiment.

Apparatus

A single 3 cm-inner diameter Perspex tube was used. Tool items were pieces of rubber and pieces of low-density polyurethane foam (mean weights 0.72 and 0.17 g, respectively), all of constant shape and size. Because foam and rubber of identical colour could not be found, each material was presented in a variety of colours such that colour was not a reliable discriminating stimulus. The possible colours for foam were green, yellow, red or blue. The possible colours for rubber were green or yellow. The colour combinations used were varied randomly between trials. When inserted, the pieces of rubber raised the water level by 4 mm each, but the pieces of foam floated.

Procedure

In Experiment 2, Hoy and Romero were presented with a single tube of water and given the choice of 12 items, 6 pieces of rubber (sinkable) and 6 pieces of foam (non-sinkable/floating). These items were placed in rows beside the tube with distance from tube controlled. The number of items needed to gain the worm was set at the medium level (approximately 4). All else was as in Experiment 1.

The two birds had varying degrees of experience with the objects. Hoy had previously used rubber to raise the level of water on 35 occasions (in an experiment comparing insertions of rubber to insertions of stone, unreported data), and Romero had used rubber with water once. Thus for Hoy, rubber could be considered a “familiar” tool, and for Romero, rubber could be considered a “quasi-novel” tool. Both had experience of foam in their cage, but not in the context of the water tube.

Results and discussion

Both observers scored the birds’ item insertions identically, and therefore, Cohen’s kappa showed perfect inter-observer reliability (k = 1). Hoy and Romero learned a preference for sinkable rubber over floating foam over the 15 trials (binomial test, P < 0.001 for both birds). Indeed, Hoy had developed a significant preference by the 10th trial (P < 0.001) and Romero by the 5th trial (P = 0.006). Figure 4 shows the trial-by-trial performance for both birds. While this result could have been due to familiarity in Hoy’s case, Romero had only used rubber once in a previous experiment, making this explanation less likely for him.

Fig. 4 Trial-by-trial performance in Experiment 2. Insertion order runs from top to bottom, and trial order runs from left to right

Experiment 3

Introduction

There are a number of ways in which the results of Experiments 1 and 2 can be interpreted. The simplest explanation for Hoy and Romero’s performance is that they were able to ascertain which of the tubes/items was “rewarded” and thus directed their actions towards this tube (i.e. Model B). Experiment 3 aimed to investigate whether this is a plausible possibility.

Methods

Subjects

Hoy, Romero and Wiggins took part in this experiment.

Apparatus

Experiment 3 used the same apparatus as Experiment 1.

Procedure

Experiment 3 followed the same procedure as Experiment 1, except that the tubes both contained woodchip of different colours. One tube (and hence one colour of woodchip) was randomly assigned as the “rewarded tube” (counter-balanced between subjects). The tubes were thus perceptually different and rewarded differently, but according to an arbitrary, rather than a physical, rule. The number of stones needed to retrieve a worm was yoked to how many the bird had actually inserted in Experiment 1a, and the starting levels of the woodchips were adjusted accordingly. As with the previous experiments, worms were present in the tubes at all times during the experiment, but the reward-worm was delivered by an experimenter placing it at the base of the rewarded tube after the correct number of stones had been inserted.

Results and discussion

There was high concordance between observers’ scoring of stone insertions. As such Cohen’s kappa showed high inter-observer reliability (k = 0.93). Wiggins’ data were excluded from analysis due to unwillingness to approach. Neither Hoy nor Romero showed any preference for the rewarded tube after the full 15 trials (binomial test, P = 1 and P = 0.726, respectively). Figure 5 shows the trial-by-trial performance for both birds.

Fig. 5 Trial-by-trial performance in Experiment 3. Insertion order runs from top to bottom, and trial order runs from left to right

Experiment 4

Introduction

Experiment 4 examined whether stone insertions were goal-directed by presenting birds with a choice between a baited tube and an unbaited tube. Goal-directed action would be characterised by a preference for stone insertions into the baited tube. If “act towards food” (i.e. Model A) were the only concept available to the jays, then they should pass this task easily, but in tasks in which both tubes were baited they should direct any behaviours (for our purposes, stone insertions) indiscriminately towards both tubes. Furthermore, in this experiment, insertion of a stone into either tube results in movement (i.e. the raising of the water level); thus if the subjects were responding principally to interesting movement (i.e. Model C), rather than movement of food, they should not discriminate between the tubes.

Methods

Subjects

Hoy, Romero, Ainsley and Wiggins took part in this experiment.

Apparatus

The apparatus was the same as that described in Experiment 1.

Procedure

Two tubes of water were provided, one baited with a worm and the other not baited. Ten stones were placed equidistantly between the tubes and the number of items needed to gain the worm was set at the medium level (approximately 4). All other factors remained the same as those described in Experiment 1.

Results and discussion

Inter-observer reliability for the scoring of stone insertions was perfect (Cohen’s kappa, k = 1). Data from Ainsley and Wiggins were excluded from the analysis due to unwillingness to approach. Both Hoy and Romero learned a preference for the baited tube by the end of 15 trials (binomial test, P < 0.001 for both birds). For Hoy, this preference appeared within the first trial (binomial test, P < 0.001) and for Romero by the 10th trial (binomial test, P < 0.001). This result suggests that these two jays were sensitive to the movement of the worm, rather than movement per se (as this was present in both tubes), and that overall their performance was goal-directed. Figure 6 shows the trial-by-trial performance of both birds.

Fig. 6 Trial-by-trial performance in Experiment 4. Insertion order runs from top to bottom, and trial order runs from left to right

Experiment 5

Introduction

The results of Experiments 3 and 4 suggest that movement of the worm is necessary information for jays learning a displacement task. This experiment investigates whether such movement cues are sufficient for learning, in the absence of causal cues (i.e. Model D).

Methods

Subjects

Hoy, Romero, Wiggins and Hunter took part in this experiment.

Apparatus

The apparatus used in Experiment 5 consisted of one horizontal and one vertical tube, 3 cm inner diameter, joined at the base to form an upright “L” shape. The back of this “L” shape was mounted on an opaque Perspex backboard. Two thin dowel sticks attached at one end to either side of a bottle cork were inserted along the horizontal tube and through two small holes in the backboard such that the cork sat inside the horizontal tube and the free ends of the sticks protruded out the back end of the apparatus. A worm was attached to the bottle cork. The apparatus was placed at the back of the cage such that the sticks protruded out of the cage-back and were manipulable by an experimenter, unseen behind the opaque cage backing (Fig. 2). Movements by the experimenter of the stick back and forth resulted in movement of the worm back and forth in the horizontal tube.

Procedure

Subjects were presented with two apparatuses. When the subject inserted a stone into the vertical tube of the “moving” apparatus, the experimenter pushed the stick 4 mm forward, thus moving the worm 4 mm forwards towards the aperture of the horizontal tube, through which the bird could access it. Stone insertions into the vertical tube of the “non-moving” apparatus resulted in no movement. Ten stones were presented in a pile in front of the apparatuses, equidistant from the vertical tubes. One possible interpretation for the null result obtained in Experiment 3 was that colour is not a valid discriminating stimulus for birds. As such, all subjects received two sets of 15 trials with this apparatus, one with colour of the base and backboard as the discriminating stimulus, with position counter-balanced, and one with position as the discriminating stimulus and colour controlled. The order in which these were presented was counter-balanced between subjects; Romero and Wiggins received colour followed by position, Hoy and Hunter received position followed by colour.

In Romero’s first exposure to this apparatus, he tended to insert the stone into the horizontal tube towards the worm rather than the vertical tube. Because this type of problem can be considered unique to this apparatus (as it is the only one where the stone can be inserted into two different places), he was retested once he had acquired the correct behaviour.

Results and discussion

Cohen’s kappa showed perfect inter-observer reliability between observers’ scoring of stone insertions (k = 1). Both Wiggins’s and Hunter’s data were excluded from the analysis as they did not insert any stones.

Experiment 5a

When colour acted as the discriminating stimulus, Hoy learned a preference for the apparatus which caused the worm to approach after 10 trials (binomial test, P = 0.05). Romero did not show this preference on his first attempt in terms of stones inserted into the vertical tube over the 15 trials (binomial test, P = 0.5), but did in terms of stones pushed into the horizontal tube (binomial test, P < 0.001). On his second attempt, he showed a preference in terms of stones inserted into the vertical tube by the 10th trial (binomial test, P = 0.01).

Experiment 5b

When the discriminating stimulus was position, Hoy, but not Romero, was successful in learning a preference for the apparatus in which the worm moved after 15 trials (binomial test, P = 0.0001 and P = 0.445, respectively). Hoy’s preference had reached significance by the 10th trial (binomial test, P = 0.0003).

When the results of Experiments 5a and 5b are aggregated (including all three attempts for Romero), Hoy, but not Romero, showed a significant preference for the moving apparatus (binomial test, P < 0.001 and P = 0.07, respectively). Figure 7 shows the trial-by-trial performance for each bird.

Fig. 7 Trial-by-trial performance in Experiment 5. Insertion order runs from top to bottom, and trial order runs from left to right

These data suggest that both Hoy and Romero learned a preference, in terms of total number of stones dropped, for an apparatus that caused the approach of food, over one that did not. This may suggest that causal cues are not necessary for these Eurasian jays to choose the appropriate behaviour to achieve reward. When combined with the results from Experiment 3, these data suggest that movement cues may be important for the jays to learn the appropriate behaviour to achieve reward. Furthermore, they suggest that these birds are able to use colour cues to differentiate between apparatuses and that, as such, the results of Experiment 3 cannot be interpreted as an inability to discriminate according to colour.

Experiment 6

Introduction

While the results of Experiment 5 suggest that causal cues may not be necessary for a jay to learn to raise the level of water, this does not by itself mean that these birds lack any causal knowledge, simply that they are able to learn a similar task using movement cues alone. In other words, this does not allow us to differentiate between Model C, which proposes that movement cues are necessary, but causal cues have no effect, and Model D, which proposes that movement cues are necessary, and there is also an effect of causal cues. Experiment 6 aimed to investigate whether there was any contribution of causal knowledge to performance. Here, we investigated whether causal knowledge would retard learning on a task in which the movement cues were present but the causal cues were confusing. If so, this would suggest that causal knowledge makes a contribution to learning on water-level tasks, despite not being completely necessary for it.

Methods

Subjects

Hoy, Romero, Wiggins and Hunter took part in this experiment.

Apparatus

The apparatus consisted of three tubes, two of 3 cm inner diameter and one of 1.3 cm inner diameter. One of the wider tubes was joined at the base with the narrow tube, such that liquid could pass between them and a stone inserted into this tube would raise the water level of both tubes. This join was hidden by the apparatus’ opaque base such that all that was visible was two wide tubes with a narrow tube between them (see Fig. 2). The stones were too large to fit inside the narrow tube. The functional and non-functional tubes were differentiated by the shape and colour of the markings at their bases (for example, a red triangle and a blue square) and the colour of the markings around their rims.

Procedure

The birds were presented with the apparatus. The narrow tube was baited with a worm. As such, insertion of a stone into the “functional” tube would bring the worm closer to the tube aperture, but insertion of a stone into the other tube would not. Ten stones were presented in a pile in front of the apparatus, equidistant from the wider tubes. So that the results of this experiment could be considered comparable to those of Experiment 5, subjects were presented with 15 trials using one colour-shape discrimination (Experiment 6a), followed by a second 15 trials using a different colour-shape discrimination (Experiment 6b).

Results and discussion

There was high concordance between the two observers’ scoring of stone insertions: Cohen’s kappa showed high inter-observer reliability (k = 0.91).
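For readers unfamiliar with the statistic, the following is a minimal sketch of how Cohen’s kappa can be computed from two observers’ codes. The function name and the example codes are hypothetical and are not taken from this study’s scoring sheets; the sketch only illustrates the standard definition, in which observed agreement is corrected for the agreement expected by chance.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical codes."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(rater_a)
    # Observed agreement: proportion of items both observers coded identically.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, from each observer's marginal category frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_observed - p_chance) / (1.0 - p_chance)


if __name__ == "__main__":
    # Hypothetical codes: which tube each observer scored for four insertions.
    observer_1 = ["functional", "functional", "other", "other"]
    observer_2 = ["functional", "functional", "other", "functional"]
    print(cohens_kappa(observer_1, observer_2))
```

Values close to 1 indicate agreement well beyond chance, which is the sense in which the reliability reported here is described as high.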

Experiment 6a

None of the subjects tested learned a preference for the rewarded tube in terms of total number of stones inserted over the 15 trials (binomial tests: Hunter, P = 1; Hoy, P = 1; Romero, P = 0.918). Wiggins did not insert enough stones for analysis.
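As an illustration of the kind of test reported here, the sketch below computes an exact two-sided binomial P value for the number of stones inserted into the rewarded tube against a chance expectation of 0.5. The function name and the counts in the usage lines are hypothetical, since the raw counts are not given in the text, and the two-sided rule used (summing all outcomes no more probable than the observed one) is an assumption about how such a test might be run rather than a description of the original analysis.

```python
from math import comb


def binomial_two_sided_p(successes, n, p_null=0.5):
    """Exact two-sided binomial P value (sum of outcomes no likelier than observed)."""
    if not 0 <= successes <= n:
        raise ValueError("successes must lie between 0 and n")

    def pmf(i):
        # Probability of exactly i rewarded-tube insertions under the null.
        return comb(n, i) * p_null**i * (1.0 - p_null) ** (n - i)

    observed = pmf(successes)
    # Small relative tolerance so that ties are not lost to rounding error.
    threshold = observed * (1.0 + 1e-7)
    p_value = sum(pmf(i) for i in range(n + 1) if pmf(i) <= threshold)
    return min(1.0, p_value)


if __name__ == "__main__":
    # Hypothetical counts: 10 of 20 stones in the rewarded tube (no preference).
    print(binomial_two_sided_p(10, 20))
    # Hypothetical counts: 15 of 20 stones in the rewarded tube.
    print(binomial_two_sided_p(15, 20))
```

A bird that splits its stones evenly between the two tubes yields the largest possible P value under this rule, which is the pattern one would expect when no preference has been learned.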

Experiment 6b

Wiggins’s and Hunter’s data were excluded from Experiment 6b due to unwillingness to approach. Neither Hoy nor Romero learned a significant preference for the rewarded tube over the 15 trials (binomial test, P = 0.211 and P = 0.59, respectively).

When the results of Experiments 6a and 6b are aggregated, neither Hoy nor Romero showed a significant preference for the rewarded tube (binomial test, P = 0.347 and P = 0.824, respectively). These results suggest that confusing or surprising causal information may be able to retard the jays’ learning on a task that presents very similar movement cues to Experiments 1 and 2. Figure 8 shows the trial-by-trial performance of all birds.

Fig. 8

Trial-by-trial performance in Experiment 6. Insertion order runs from top to bottom, and trial order runs from left to right

General discussion

Our results indicate that there are considerable individual differences between Eurasian Jays in their ability to learn to use simple tools to receive a reward. All but one of the birds were able to learn to drop a stone into a tube to receive an immediate reward; however, only two of the five birds were able to use objects appropriately to receive rewards that gradually approach, while three were not.

These two birds showed a preference for the use of functional tool items (sinkable objects) and substrates (water) in tasks in which objects must be inserted into a substrate in order to raise its level to bring a floating food item within reach. In transfer experiments aimed at identifying what information these individuals used to solve these tasks, it was found that they were able to learn to insert stones into an apparatus that caused the approach of the food without apparent mechanical causation, but not when the reward was delivered without such movement cues or when causal cues were confusing.

Figure 9 describes the jays’ performance across all of the experiments. This pattern of performance most closely resembles that predicted by “Model E” in Fig. 1, namely “Do the action that causes the movement of food. Choice of action is affected by, but not reliant on, some causal knowledge about what is happening”. In other words, these results appear to indicate that both instrumental learning of actions that cause food to approach and some concept of what “should” be happening (i.e. causal knowledge) contribute to performance at the water-level task. If subjects had relied purely on instrumental conditioning, one would expect them to be equally successful in tasks with congruent causal cues and in tasks with almost identical visual feedback but incongruent causal cues. If they relied entirely on causal understanding or “insight”, then one might expect them only to be successful in tasks with available and congruent causal cues. This idea is expanded below.

Fig. 9

Pattern of actual task performance. Black squares represent tasks in which a preference for the rewarded tube/apparatus was not apparent from total items inserted, white squares indicate a preference for the rewarded tube/apparatus, and grey squares indicate an unclear result (i.e. approached, but did not reach, significance). Dotted squares indicate tasks not performed. The order in which the experiments were undertaken is indicated by the number in the box. Here, we see that Ainsley, Hunter and Wiggins appear unable to learn, or unwilling to participate in, any of the tasks, while Hoy and Romero’s pattern of task performance most closely resembles the pattern of performance predicted by Model E, that is, that both instrumental conditioning and causal knowledge contribute to learning

Not “pure” conditioning

If a task is learned instrumentally, then the ability to acquire a behaviour should rely on the perceptual and biological salience of the stimulus, action and the reinforcer. When two tasks involve identical actions, and near identical stimuli and reinforcers, one might expect slight variation in the speed of acquisition, but one would not expect the same individual to learn one task and fail to learn the other.

To analyse our findings in this light, we must consider what the “stimulus” and “reinforcer” in these tasks might be.

The first possibility is that a subject could simply learn that insertion of stones into one tube is eventually rewarded with food, while insertion of stones into the other is not. However, the inability of both Hoy and Romero to pass a task in which this was the only cue suggests that this was not what they had learned.

A second possibility is that the subjects learned that insertion of a stone into one tube causes the incremental approach of a food reward, while insertion of a stone into the other does not. This type of “incremental” conditioning has recently been demonstrated by Taylor et al. (2010a) as a pre-requisite for New Caledonian Crow string pulling. They demonstrated that subjects were only able to learn the sequence of actions necessary for successful string pulling if they had visual access to the food item approaching as they pulled. The authors note that while this is a “simple” account for an apparently complex behaviour, it is possible that it requires more sophistication than basic conditioning, since other species, such as siskins and goldfinches, are in general unable to learn this task despite being able to learn things instrumentally. This account would predict that in the absence of causal cues, or in the presence of confusing causal cues, as long as the reward approached incrementally with each stone insertion, the task would still be learned. A purely instrumental task appreciation of this kind would therefore predict that the birds would perform similarly in Experiments 1, 5 and 6.

Not “pure” understanding

Taking the reverse perspective, if the birds’ performance was entirely the product of causal knowledge, the question then arises as to how a preference was learned for the rewarded apparatus in Experiment 5, where no causal cues were available. Furthermore, we would not expect to see a gradual learning effect but rather a leap from failure to success once the causal properties have been realised.

Comme ci, comme ça

Our results seem to indicate that Hoy and Romero rely on a combination of instrumental conditioning and causal cues to solve the water-level task. Theories of associative learning emphasise the salience of both the stimulus and the reward as factors affecting learning, and some hold that this salience is determined by the attention paid to them (e.g. Pearce et al. 1980). In terms of the Rescorla-Wagner learning rule (Rescorla and Wagner 1972), it could be argued that the salience of the reward (β) is substantially affected by prior knowledge or experience of the form that it will take; i.e. if causal cues are meaningful to the subject, this will increase the salience of a reward whose form is congruent with such cues. A non-cognitive example of this could be an experiment in which three rats learn to press a lever: one has previously been taught that food is delivered in the left magazine; one has been taught that food is delivered in the right magazine; and one is naïve. If, in the actual experiment, the food was in fact delivered in the right magazine, one might expect the rat with incongruent prior knowledge to learn more slowly than the naïve rat, and the rat with congruent prior knowledge to learn more quickly than the naïve rat. This would not be because the task is inherently harder or easier for any of these individuals, but because prior knowledge of the magazines would affect the delay between the action and perception of the reinforcer, thus affecting learning. We suggest that some elements of causal understanding may act like “prior training” in these circumstances, facilitating or impeding learning of tasks according to whether they fit with the physical “laws” as they are understood by the subject. This “causal expectancy bias” could allow for more efficient trial-and-error learning of tasks with inherent causal properties by creating an attentional bias for cues that are causally relevant.
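The argument above can be made concrete with a minimal simulation of the Rescorla-Wagner rule, in which associative strength V changes on each trial by αβ(λ − V). The sketch below is illustrative only: the function names and every parameter value (α, the three β values, the 90 % criterion) are hypothetical choices made for the example and are not estimates from the jays’ data. It simply shows that, all else being equal, a larger β for the reinforcer leads to faster acquisition.

```python
def rescorla_wagner(n_trials, alpha, beta, lam=1.0, v0=0.0):
    """Return associative strength after each of n_trials reinforced trials."""
    strengths = []
    v = v0
    for _ in range(n_trials):
        # Rescorla-Wagner update: change is proportional to the prediction error.
        v += alpha * beta * (lam - v)
        strengths.append(v)
    return strengths


def trials_to_criterion(alpha, beta, lam=1.0, criterion=0.9, max_trials=1000):
    """First trial on which strength reaches criterion * lam, or None if never."""
    curve = rescorla_wagner(max_trials, alpha, beta, lam)
    for trial, v in enumerate(curve, start=1):
        if v >= criterion * lam:
            return trial
    return None


if __name__ == "__main__":
    ALPHA = 0.5  # hypothetical stimulus salience, identical for all learners
    # Hypothetical reward saliences reflecting prior expectation about the reward.
    HYPOTHETICAL_BETA = {"congruent": 0.6, "naive": 0.4, "incongruent": 0.2}
    for label, beta in HYPOTHETICAL_BETA.items():
        print(label, trials_to_criterion(ALPHA, beta))
```

Under these assumed values, the learner with congruent expectations reaches criterion in fewer trials than the naïve learner, which in turn reaches it sooner than the learner with incongruent expectations, mirroring the ordering proposed for the three rats.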

Could the same account apply for other forms of tool-use?

We tend to concur with von Bayern et al. (2009) that “insight” can never be a satisfactory explanation for any behaviour as “this label avoids identifying the exact processes by which a solution is obtained” (p. 1966). Although many researchers have criticised the attribution of “insight” as an explanatory mechanism (see Shettleworth 2010 for a recent overview), it is only recently that researchers have begun to conduct detailed analyses of the psychological processes underlying the nature of tool-use acquisition. It may be that a mutual conditioning-causal knowledge explanation (such as the causal expectancy bias) could be applied to other examples of tool-use.

Let us take the example of chimpanzee nut-cracking. When cracking nuts, a chimpanzee may have to hit a nut several times to be successful (see Footnote 2). The nut will not stay in the same condition after every impact but become increasingly cracked. So, just as the raising of the water level could act as an incremental reinforcer for the birds, increasingly cracked nuts could act as an incremental reinforcer for the chimpanzees. There is evidence that tool-using primates appreciate the necessary properties for a stone-cracking tool (Boesch and Boesch 1983; Visalberghi et al. 2009). However, all these studies used animals with extensive experience with nut-cracking and did not follow how these behaviours were learned. An appreciation of what tool is appropriate for a job does not rule out the possibility that this appreciation was acquired instrumentally; after all, hitting a nut with an inappropriate tool does not lead to incremental cracking and is therefore unrewarded. However, as Taylor et al. (2010a) point out, if instrumental conditioning were the only mechanism by which such behaviours are learned, then why is tool-use limited to so few species and, in the main, to the species which also show intelligence in other domains? Our suggestion of contributions to learning from both causal knowledge and instrumental conditioning may provide an answer to such questions.

Conclusions

Like the Rook, the Eurasian Jay is not a tool-user in the wild and has until now never been observed to use tools in captivity. Regarding our first objective, the goal-directed manipulation of objects demonstrated in this study has implications not just for this species, but for the corvid family at large. These findings may be taken to suggest that physical cognition evolved much earlier in the corvid lineage than previously thought, as the Garrulus genus is only distantly related to the Corvus genus in which these skills have previously been found.

Regarding our second objective, namely an investigation into the possible mechanisms controlling this behaviour, the findings from this series of experiments suggest that, instead of always attempting to distinguish “simple” instrumental conditioning from higher cognitive understanding, it may be useful to explore the areas where these two systems work together in mutual facilitation (see Taylor et al. 2010b for a similar argument). Furthermore, it is useful to identify what information is used by animals to learn complex behaviours; from this, we are able to clarify the precise nature of the learning mechanisms involved. This approach suggests an avenue for further investigation into tool-use across a wide variety of species, as well as for a range of other cognitive tasks.