This study was approved by the Ethical Review Group at the University of Exeter (No. 2012/533), and the experiment was carried out in accordance with the Association for the Study of Animal Behaviour and Animal Behaviour Society guidelines and UK law.
Subjects and housing
Five squirrels, living in the laboratory, participated in this study. They were named Arnold, Leonard, Sarah, Simon and Suzy and included two females and three males. Their mean age was 6 years; see Supplementary Materials Table S1 for further information on each squirrel. The temperature in their housing was controlled at a constant 19 °C, and lighting was on a 12-h:12-h day–night cycle, with all testing conducted during the light period. The squirrels were housed in large cages that were constructed using metal mesh. In each cage, there was a sliding metal door connected to an overhead tunnel. Only one squirrel was allowed access to the test room at a time for this experiment. A metal mesh divided the test room into two equally large cages (each 1.5 × 1.8 × 2.5 m). The front and ceiling of the cages were metal mesh, whereas the side and the back of the cages were solid concrete wall. One cage had a touch screen panel, set 2 m above the floor as reported in Chow et al. (2017). A camera (Panasonic SWHD-90) was set up in the adjacent cage to capture all behavioural responses during the experiment. Further details of the housing and test room set-up are given by Hopewell et al. (2010). All the squirrels had similar experimental histories in cognitive tasks (see Supplementary Materials Table S1 for details). Within the 22 months prior to the present study, the squirrels did not interact with the puzzle box used by Chow et al. (2016) or any similar problem-solving task, nor were they exposed to similar designs as enrichment; they did participate in a serial spatial reversal learning task, as reported by Chow et al. (2015). The squirrels were not food-deprived, and water was provided ad libitum. We maintained the squirrels’ motivation by using rewards (hazelnuts) that differed from their daily diet (seeds, fresh fruit and vegetables).
Doors allowing the squirrels to enter the test room by the overhead tunnel from their home cages were opened during the times of day when they were most active (0700–1100 and 1500–1800), and tests were carried out when a squirrel entered the test room spontaneously. Data collection took place between May and July 2015.
Puzzle box for the recall task
Figure 1a shows the puzzle box that was presented to squirrels by Chow et al. (2016), 22 months before the present experiment; we used the same box for this experiment. The box was a transparent Plexiglas cuboidal box (length 25 × width 25 × height 25 cm) that had 10 holes on each side. Ten levers (each 29.8 × 1.5 cm, with a thickness of 0.5 cm), five functional (baited with hazelnuts) and five non-functional (without hazelnuts), were inserted across the box through holes in opposite sides. The holes (2 × 0.9 cm, W × H) on the box were designed to be larger than the thickness of a lever (0.5 cm), so that squirrels could see and smell the nuts but could not directly reach them after a lever was inserted. At one end of each lever, there was a three-sided container, and this was positioned just inside the box. Four wooden legs were used to support the box, creating a 4.5-cm gap through which squirrels could obtain the hazelnuts once they fell out of the containers. Although squirrels could use many types of behaviour on the levers, the apparatus was designed so that certain behaviours were effective (i.e. the most efficient way of obtaining a nut), whereas certain other behaviours could not solve the task. The effective behaviours were pushing the ‘near-end’ of a lever and pulling its ‘far-end’ (near- and far-end refer to proximity to the hazelnut bait); the ineffective behaviours were pulling the ‘near-end’ of a lever and pushing its ‘far-end’.
Puzzle box for the generalisation task
Figure 1b, c shows the apparatus used in the generalisation task. It was a transparent puzzle box in the shape of a four-sided triangular prism (triangular front 35 × 19 × 18 cm, length × width × height; rectangular side 25 × 20 cm) with five levers inserted. The puzzle box differed completely in physical characteristics and colour from the one used in the recall task, but it still involved moving levers, so that we could examine whether squirrels applied the learned effective and ineffective tactics to obtain the nuts. The levers were shorter than in the recall task, and both ends of each lever were slightly curved (lever dimensions 23.5 × 2 × 0.2 cm, L × W × H). The generalisation box had five holes (2 × 0.9 cm) on each side, which were horizontally but not vertically aligned with holes on the opposite side. Because squirrels showed a strong preference for choosing the functional levers (with hazelnuts) both in the original (Chow et al. 2016) and in the recall tasks (see results section), we further increased the difference between the recall and generalisation task by including only functional levers. As Fig. 1c shows, both lever ends protruded 1.5 cm out of the box. The box was supported by four wooden legs, creating a 3.5-cm gap from its base. The base of the box (32 × 10 × 3 cm) was a sloped wooden platform (silver grey in colour) which allowed a nut to roll down once it had fallen. As in the recall task, squirrels could see and smell the rewards but could not reach them directly. Squirrels could direct the same effective and ineffective behaviours at each lever as in the recall task: pulling the near-end or pushing the far-end of a lever was ineffective, so they had to push the near-end or pull the far-end to obtain a nut.
Squirrels first participated in the recall task, so we could examine whether they remembered the puzzle box they had experienced 22 months earlier. The generalisation task was presented 6 days later so as to examine whether squirrels could transfer the same effective behaviours to a physically different box. We kept the same procedures as in Chow et al. (2016) for both the recall and the generalisation tasks; squirrels were tested individually to avoid confounding factors such as stimulus enhancement or social learning in the task. Each squirrel participated in three blocks of four trials in each task (for a total of 12 trials), with a 1-day break between each block (for a total of 14 testing days). In each trial, we placed the box at the centre of the test room. A trial started when a squirrel touched or manipulated any part of the box. The trial ended when the squirrel completed the task by obtaining all the rewards, when it had not touched the apparatus for 15 min, or when 45 min had elapsed, whichever came first. If a squirrel did not respond, we repeated the trial the next day. This only happened with one squirrel, Suzy, in one trial in the recall task. After every trial, we removed the odour left on the apparatus using disinfectant-impregnated cleaning wipes. We also used wipes after baiting in order to minimise any human scents left on the apparatus. For both tasks, the orientation of the apparatus and the direction the levers faced were pseudo-randomised between trials. For the recall task, we additionally randomised whether a given lever was functional or not. A single success at solving the problem was defined as a squirrel causing a functional lever and/or a nut to drop. A trial therefore normally consisted of five successes.
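The three stopping conditions for a trial can be written as a simple rule. The following is only an illustrative sketch: the function name, argument names and returned labels are hypothetical, and the order of the checks is an assumption that matters only if two conditions are met at the same moment.

```python
from typing import Optional

NO_CONTACT_LIMIT_S = 15 * 60  # 15 min without touching the apparatus
TRIAL_LIMIT_S = 45 * 60       # 45 min maximum trial duration


def trial_end_reason(rewards_obtained: int, total_rewards: int,
                     s_since_last_touch: float,
                     s_elapsed: float) -> Optional[str]:
    """Return a label for why the trial ends now, or None to continue.

    All names and labels here are hypothetical, chosen for illustration.
    """
    if rewards_obtained >= total_rewards:
        return "completed"       # all rewards obtained
    if s_since_last_touch >= NO_CONTACT_LIMIT_S:
        return "no_contact"      # apparatus untouched for 15 min
    if s_elapsed >= TRIAL_LIMIT_S:
        return "time_limit"      # 45 min elapsed
    return None


if __name__ == "__main__":
    print(trial_end_reason(5, 5, 10.0, 600.0))
    print(trial_end_reason(2, 5, 30.0, 600.0))
```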
For both the recall task and the generalisation task, we measured the contact latency, defined as the time from when a squirrel entered the test room until it first touched the apparatus with its nose or paws. We used the contact latency on the last trial of the recall task and on the first trial of the generalisation task as a measure of neophobia. This allowed us to test whether the squirrels perceived the pyramid-shaped apparatus as a novel stimulus in the generalisation task.
We also measured the time taken to obtain each reward; this was used as a measure of problem-solving efficiency. Latency was timed from the moment when a squirrel started to manipulate a functional lever until the nut it contained dropped. Not every manipulation of a functional lever led to success, but the time spent in unsuccessful manipulation on it was still included. For each trial, we summed all the latencies on functional levers and then divided this total success latency by the number of functional levers that a squirrel solved during that trial, to obtain the mean latency to each success.
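As a worked illustration of this calculation, the sketch below sums the manipulation times on functional levers and divides by the number of levers solved. The input layout (one time and one solved flag per lever) and the function name are assumptions made for the example, not the authors' data format.

```python
from typing import List, Tuple


def success_latencies(levers: List[Tuple[float, bool]]) -> Tuple[float, float]:
    """Compute (total success latency, mean latency to each success).

    `levers` holds one hypothetical (manipulation_time_s, solved) pair per
    functional lever manipulated in a trial. Time spent in unsuccessful
    manipulation is included in the total.
    """
    total = sum(time_s for time_s, _ in levers)
    n_solved = sum(1 for _, solved in levers if solved)
    if n_solved == 0:
        raise ValueError("no functional lever was solved in this trial")
    return total, total / n_solved


if __name__ == "__main__":
    # Invented example: five functional levers, all solved.
    trial = [(12.0, True), (8.5, True), (20.0, True), (5.5, True), (14.0, True)]
    total, mean = success_latencies(trial)
    print(total, mean)  # 60.0 12.0
```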
Measurement of behavioural traits
The four behavioural traits (persistence, motor diversity, selectivity and flexibility) were measured using methods standardised by Chow et al. (2016). The first author coded all behaviours from videos using Adobe Premiere Pro CS6, which allowed us to analyse behavioural data on a frame-by-frame basis. The behavioural measures of each trait co-vary with one another, and it is therefore necessary to tease them apart analytically to avoid multicollinearity. The measures also need to be normalised, since the longer a trial lasts, the more opportunity there is for a behaviour to be performed. Accordingly, rates of occurrence of behaviours rather than raw counts were used, as in previous experiments (e.g. Biondi et al. 2008; Chow et al. 2016; Griffin et al. 2014; Griffin and Diquelou 2015; Papp et al. 2015). All measurements were taken trial-by-trial for each task (12 trials per task). For the recall task, we recorded the measures on the functional levers only, to allow direct comparison with the generalisation task in which only functional levers were used.
Selectivity was measured as the proportion of effective behaviours. We counted the number of effective behaviours (i.e. either pushing the near-end or pulling the far-end of a functional lever) and the number of ineffective behaviours (i.e. either pushing the far-end or pulling the near-end of a functional lever) in each trial. Then, we divided the number of effective behaviours by the total number of effective and ineffective behaviours for that trial.
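The proportion described here can be sketched as follows. The encoding of each behaviour as an (action, lever end) pair is an assumption for illustration; only the classification of the four combinations comes from the text.

```python
from typing import List, Optional, Tuple

# Classification taken from the text; the tuple encoding is hypothetical.
EFFECTIVE = {("push", "near"), ("pull", "far")}
INEFFECTIVE = {("push", "far"), ("pull", "near")}


def selectivity(behaviours: List[Tuple[str, str]]) -> Optional[float]:
    """Proportion of effective behaviours among all scored behaviours.

    Returns None when no effective or ineffective behaviour was recorded.
    """
    n_eff = sum(1 for b in behaviours if b in EFFECTIVE)
    n_ineff = sum(1 for b in behaviours if b in INEFFECTIVE)
    scored = n_eff + n_ineff
    return n_eff / scored if scored else None


if __name__ == "__main__":
    # Invented trial: three effective behaviours and one ineffective one.
    trial = [("push", "near"), ("pull", "near"), ("pull", "far"), ("push", "near")]
    print(selectivity(trial))  # 0.75
```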
Persistence has been used to assess motivation (e.g. Biondi et al. 2008; Chow et al. 2016; Griffin et al. 2014). We measured persistence as the rate of attempting to solve the problem. An attempt was recorded whenever a squirrel used any part of its body to manipulate a functional lever, regardless of whether the manipulation was an effective or an ineffective behaviour. A new attempt was counted when a squirrel switched to a different functional lever or when it returned to manipulating the same lever after at least 1 s without body contact with that lever. We counted the total number of attempts on all functional levers in each trial and then divided this number by the total success latency, as defined above.
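The attempt-counting rule can be made concrete with a short sketch. It assumes that video coding yields a time-ordered list of contact bouts, each with a lever label, a start time and an end time; that representation and all names are hypothetical.

```python
from typing import List, Tuple

MIN_GAP_S = 1.0  # at least 1 s without contact starts a new attempt

Bout = Tuple[str, float, float]  # (lever_id, start_s, end_s), hypothetical


def count_attempts(bouts: List[Bout]) -> int:
    """Count attempts from time-ordered contact bouts on functional levers."""
    attempts = 0
    prev_lever = None
    prev_end = None
    for lever, start, end in bouts:
        switched_lever = lever != prev_lever  # also true for the first bout
        long_gap = prev_end is not None and (start - prev_end) >= MIN_GAP_S
        if switched_lever or long_gap:
            attempts += 1
        prev_lever, prev_end = lever, end
    return attempts


def persistence_rate(bouts: List[Bout], total_success_latency_s: float) -> float:
    """Attempts divided by the total success latency for the trial."""
    if total_success_latency_s <= 0:
        raise ValueError("total success latency must be positive")
    return count_attempts(bouts) / total_success_latency_s


if __name__ == "__main__":
    # Invented bouts: the 0.4 s pause on lever A does not start a new attempt.
    bouts = [("A", 0.0, 2.0), ("A", 2.4, 3.0), ("A", 4.5, 6.0), ("B", 6.2, 8.0)]
    print(count_attempts(bouts))  # 3
```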
Motor diversity was measured as the rate of using different tactics in solving the problem. We used Table 1 of Chow et al. (2016) to code the tactics that squirrels used while solving a functional lever. Nine types of behaviour were coded: pull, push in, push up, push down, tilt up, claw, lick, shake and any two or more of these behaviours occurring simultaneously (combined behaviours). We obtained the rate of motor diversity for each trial by counting the number of types of behaviour that a squirrel exhibited during the trial (range 1 to 9) and then dividing this number by the total success latency for the trial, as defined above.
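A minimal sketch of this rate is given below. The nine category labels follow the list in the text, but the string spellings and the single "combined" label are assumptions made for the example.

```python
from typing import List

# The nine coded behaviour types; label spellings are hypothetical.
TACTICS = {"pull", "push in", "push up", "push down", "tilt up",
           "claw", "lick", "shake", "combined"}


def motor_diversity_rate(observed: List[str],
                         total_success_latency_s: float) -> float:
    """Number of distinct behaviour types divided by total success latency."""
    unknown = set(observed) - TACTICS
    if unknown:
        raise ValueError(f"unrecognised tactic labels: {sorted(unknown)}")
    if total_success_latency_s <= 0:
        raise ValueError("total success latency must be positive")
    return len(set(observed)) / total_success_latency_s


if __name__ == "__main__":
    # Invented trial: four distinct types over a 40 s total success latency.
    coded = ["pull", "push in", "pull", "claw", "combined"]
    print(motor_diversity_rate(coded, 40.0))  # 0.1
```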
Flexibility was measured as the rate of switching between tactics. A switch was counted whenever a squirrel changed from any of the tactics listed in motor diversity to a different one, regardless of whether either of the tactics involved was effective. We first counted the number of switches between tactics and then divided this number by the total success latency, as defined above, to obtain the rate of flexibility in each trial. To further examine squirrels’ retrieval strategies, we measured the mean number of ‘non-productive’ switches (i.e. switches from effective to ineffective behaviours) across functional levers.
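The switch counts can be illustrated with a short sketch. It assumes each coded behaviour is stored in temporal order as a (tactic, is_effective) pair, and that a 'non-productive' switch is a change of tactic from an effective behaviour to an ineffective one; both the data layout and that reading are assumptions for illustration.

```python
from typing import List, Tuple

Coded = Tuple[str, bool]  # (tactic label, is_effective), hypothetical layout


def count_switches(sequence: List[Coded]) -> Tuple[int, int]:
    """Return (all switches between tactics, non-productive switches)."""
    switches = 0
    non_productive = 0
    for (prev_tactic, prev_eff), (tactic, eff) in zip(sequence, sequence[1:]):
        if tactic != prev_tactic:
            switches += 1
            if prev_eff and not eff:
                non_productive += 1  # effective -> ineffective
    return switches, non_productive


def flexibility_rate(sequence: List[Coded],
                     total_success_latency_s: float) -> float:
    """Switches between tactics divided by the total success latency."""
    if total_success_latency_s <= 0:
        raise ValueError("total success latency must be positive")
    return count_switches(sequence)[0] / total_success_latency_s


if __name__ == "__main__":
    # Invented sequence: three switches, the first of which is non-productive.
    seq = [("push in", True), ("push in", True), ("pull", False),
           ("claw", False), ("push in", True)]
    print(count_switches(seq))  # (3, 1)
```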
We used R version 3.3.2 (R Core Team 2016) to analyse all behavioural data. All tests were two-tailed, and results were considered significant when P < 0.05.
For the recall task, we used exact binomial tests to examine whether each squirrel was significantly more likely to direct attempts at functional levers (baited with hazelnuts) than at non-functional levers (without hazelnuts). We then pooled the P values using Fisher’s formula χ² = −2 Σ ln(P) (Sokal and Rohlf 1995 p. 794). For the generalisation test, we used a Wilcoxon signed-rank test to assess differences in contact latency from the recall test, and Spearman’s correlation to examine relationships between contact latency and mean success latency on the first trial.
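The per-squirrel binomial test and the pooling of P values can be sketched in a few lines. The analysis itself was run in R; the Python below is only an illustration of the arithmetic, and the null probability of 0.5 is an assumption (five functional and five non-functional levers). Under Fisher's method the statistic is referred to a chi-square distribution with 2k degrees of freedom for k pooled P values, and for even degrees of freedom the upper tail has a closed form.

```python
import math
from typing import Sequence, Tuple


def binom_two_sided(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial P value (p = 0.5 is an assumed null).

    Sums the probabilities of all outcomes no more likely than the
    observed count k.
    """
    pmf = [math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)]
    cutoff = pmf[k] * (1 + 1e-7)  # small tolerance for floating-point ties
    return min(1.0, sum(q for q in pmf if q <= cutoff))


def fisher_combined(p_values: Sequence[float]) -> Tuple[float, int, float]:
    """Return (chi-square statistic, degrees of freedom, pooled P value)."""
    if not p_values or any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("P values must lie in (0, 1]")
    chi2 = -2.0 * sum(math.log(p) for p in p_values)
    k = len(p_values)
    half = chi2 / 2.0
    # Upper tail of a chi-square distribution with 2k degrees of freedom.
    tail = math.exp(-half) * sum(half**i / math.factorial(i) for i in range(k))
    return chi2, 2 * k, tail


if __name__ == "__main__":
    # Invented counts: attempts at functional levers out of all attempts.
    p_each = [binom_two_sided(9, 10), binom_two_sided(5, 5)]
    print(fisher_combined(p_each))
```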
We used generalised estimating equations (GEE) with exchangeable ‘working’ correlation (Hardin and Hilbe 2003) to investigate (1) whether the mean latency to each success in the first trial of the recall task differed from the mean latency to each success in the first trial and the last trial of the original task; (2) whether the mean latency to each success in the last trial of the recall task differed from the mean latency to each success on the first trial of the generalisation task; (3) how the mean latency to each success varied across trials in each task; (4) how each behavioural trait (rate of attempts, rate of flexibility, rate of motor diversity and proportion of effective behaviours) varied across trials; and (5) how the behavioural traits contributed to increasing efficiency in the recall task and in the generalisation task, separately. GEE is a quasiparametric statistical test for model estimates. Because small sample size leads to underestimation of the variance of parameter estimates, we obtained the P values using the package ‘geesmv’ (Wang 2015), which adjusted the modified ‘sandwich’ variance estimator (Wang and Long 2011) for estimating the variance–covariance matrix of the parameter estimates. This modified variance has been shown to be robust for experiments that have very small sample size with each individual completing all trials, as in our case.
We used Pearson correlations to explore the relationships between covariates before model testing. Attempt rate and motor diversity were highly correlated in the recall task (r = 0.78) and in the generalisation task (r = 0.86). Attempt rate and selectivity were also highly correlated (r = 0.53) in the generalisation task. To avoid confusion in interpreting the results due to multicollinearity and, in line with the primary focus of this study on memory for task-effective behaviours, we selected variables for model estimations as follows. We included attempt rate, selectivity, switch rate and trial number for the recall task, but excluded attempt rate from the model estimation for the generalisation task because, given the high level of accuracy, it was confounded with the other traits.