Ethical approval for this study was obtained from the ‘Ethik und Tierschutzkommission’ of the University of Veterinary Medicine Vienna (Protocol number ETK-16/09/2017, ETK-20/09/2017). Informed consent was obtained from all owners of the pet dogs. The authorization to test the free-ranging dogs was provided by the municipality of Taghazout (Morocco).
Subjects
Pet dogs (Pd). Mixed-breed pet dogs were tested in private homes in Austria. The subjects were recruited both from the Clever Dog Lab database and via social media. A total of 20 pet dogs were tested (13 F, 7 M; mean age in years: 6.3 ± 0.6 SE).
Free-ranging dogs (FRd). Free-ranging dogs were tested in their natural environment in the municipality of Taghazout, Agadir, Morocco. The experimenters (ML, LD and RM) travelled by car to look for solitary dogs (solitary dogs were chosen to avoid interference by conspecifics). Only adult dogs (appearing to be over 1 year of age) were tested. A total of 62 dogs performed at least one test condition. Of these, 31 dogs were excluded from the analyses because other dogs interfered during the test. Hence, a total of 31 free-ranging dogs (12 F, 19 M) were included in the analyses. The tested free-ranging dogs were village dogs living around human settlements. They were well socialized with humans and had daily experience of humans near their food sources (mainly garbage). Many of them had also received food directly from humans. However, while the dogs might associate humans with food, they did not receive help from humans in obtaining food.
Apparatus
The apparatus consisted of a wooden board (length: 1 m, width: 0.5 m) with four overturned transparent and perforated containers baited with three types of food simultaneously (dry food, sausage and cheese). Three out of the four containers could be moved (possible bowls), whereas the fourth one was attached to the board (impossible bowl). To avoid habituation to the apparatus, we used different shapes of containers for each condition (bottom of a rigid plastic bottle, top of a rigid plastic bottle, or Tupperware box) that were counterbalanced across dogs and conditions (e.g. a subject experienced the Tupperware box as the container in condition 1, and the rigid plastic bottle top as the container in condition 2, whereas another subject might have experienced the rigid plastic bottle bottom in their condition 1 etc.). The objects were chosen to be at least somewhat familiar to the free-ranging dogs.
Testing procedure
All pet dogs were tested in all four test conditions: (1) social; (2) dummy; (3) object; (4) alone (80 tests in total with Pd). Fourteen of the free-ranging dogs were tested in the social and alone conditions, while 17 naïve dogs were tested only in the ‘dummy’ human condition (45 tests in total with FRd). Depending on the test condition (social, dummy, object), the experimenter, a ‘dummy’ human (Han Solo figure, width: 59 cm, height: 186 cm) or a ‘dummy’ object (width: 64 cm, height: 188 cm) stood 1.5 m behind the apparatus. Both the objects and the experimenter were unknown to the dogs (see Fig. 1). Where we were able to test the same dogs twice, the conditions were counterbalanced across dogs.
Pet dogs (Pd). Subjects were tested in their owners’ homes in Vienna. The animals were initially moved to a different room before the apparatus was placed in the testing room. Once the apparatus, the cameras and, where applicable, the human, the human-shaped cardboard figure or the object were in place, the dog was led into the room. For the social condition, the experimenter stood at a distance of 1.5 m from the apparatus looking at her phone during the entire test, while the owner waited in a separate room. For the other conditions, both the owner and the experimenter left the house after giving a ‘goodbye’ signal to the dog according to the usual routine of the specific dog–human dyad. All tests were recorded using two cameras, one of which was remotely controlled to observe the subject in the three conditions in which it was left alone in the room.
Free-ranging dogs (FRd). Free-ranging dogs were tested in the streets and on the beaches of Taghazout. Once a subject was located alone, the experimenter placed the apparatus on the ground, taking care not to be seen by the subject. The experimenter then stood 1.5 m behind/next to the apparatus (for the social condition) or hid in the car (control conditions). A second experimenter went to the dog, petted it for a few seconds and then walked towards the apparatus, making sure that the dog followed. The experimenter did not show the apparatus to the dog, but simply walked past it and then got into the car. The test started when the dog approached the apparatus. All tests were filmed from the car or by the experimenter standing in front of the apparatus (social condition).
The tests started when the dog approached the apparatus (i.e. when they were within 10 cm of the apparatus) and ended if the subject stopped interacting with the apparatus (sniffing it or manipulating it) for 5 min. Thus, the whole test duration was not fixed, but determined by the behaviour of the subject (Online Resource 2). We tested pet dogs and free-ranging dogs in different environments (indoor and outdoor), because the common and most important feature was that both groups were tested in their most familiar environment, where it was assumed that they would feel most comfortable.
Analyses
All the videos were coded using the software Solomon coder (developed by András Péter, Dept. of Ethology, Budapest, www.solomoncoder.com). See Table 2 for definitions of the coded behaviours.
Table 2 Detailed description of the coded behaviours
Inter-observer reliability was assessed among three observers, each coding 20% of the video data (intra-class correlation coefficient: persistence ICC = 0.99, looking back frequency ICC = 0.81, looking back duration ICC = 0.82, looking back latency ICC = 0.96).
For statistical analyses, we used generalized linear models (GLM) and generalized linear mixed models (GLMM). All models were fitted in R (version 3.6.1; R Core Team 2019) using the functions lm (R package stats), lmer (R package lme4) (Bates et al. 2014), glmmTMB (Brooks et al. 2017) and coxme (Therneau 2019). Model residuals of Gaussian models were tested for normality and homogeneity using diagnostic plots. Where the initial model did not meet the assumption of normally distributed residuals (models P1, P2, P3, L2, L3), we applied the Box-Cox method, using the package MASS (Venables and Ripley 2002), to identify the appropriate transformation of the response variable (log transformation for models P1, P2, P3 and L3; square root transformation for model-L2). The graphs nevertheless show the non-transformed data. Collinearity of predictors, assessed by applying the function vif of the R package car (Fox et al. 2012), appeared not to be an issue (Quinn and Keough 2002). Overdispersion appeared not to be an issue (range of dispersion parameters 0.19–1.17) except for models DL2, DL3, W2 and W3, where we applied a function kindly provided by Roger Mundry to correct SE, z- and P values for individual predictors. We assessed model stability at the level of the estimated coefficients and standard deviations by excluding the levels of the random effects one at a time (Nieuwenhuis et al. 2012). Overall, all models except model-F1, model-F2 and model-F4 were of moderate or good stability (Online Resource 1). For models including more than one predictor, P values for the individual effects were based on likelihood ratio tests comparing the full model with the respective reduced models lacking the predictor in question (R function ‘anova’) (Barr 2013).
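For readers unfamiliar with the Box-Cox family, the following sketch shows the transformation itself; it is not the MASS routine used for the analyses, and the function name is hypothetical. With λ = 0 the family reduces to the log transformation, and with λ = 0.5 it is a linear rescaling of the square root, which corresponds to the two transformations reported above.

```python
import math


def box_cox(y, lam):
    """Box-Cox power transformation of a single positive value.

    Illustrative helper (name assumed): returns log(y) for lam == 0,
    otherwise (y**lam - 1) / lam.
    """
    if y <= 0:
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam


# lam = 0.5 equals 2 * sqrt(y) - 2, i.e. a rescaled square root.
durations_s = [4.0, 9.0, 25.0]
sqrt_like = [box_cox(y, 0.5) for y in durations_s]
log_like = [box_cox(y, 0.0) for y in durations_s]
```

Because a linear rescaling of the response does not affect the shape of the residual distribution, fitting the model to the plain square root or log of the response is equivalent for the purpose of meeting the normality assumption.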
Results were supplemented with Bayes factors, which were computed with the BayesFactor package (Morey and Rouder 2018) using the functions anovaBF and lmBF. For models DL1, DL2, DL3, W1, W2 and W3, Bayes factors were calculated manually using the BIC approximation (Wagenmakers 2007). When frequentist inference yields a non-significant result, the null hypothesis cannot be rejected, but the result does not quantify the evidence in its favour. Bayesian statistics allow a determination of whether the data provide stronger evidence for the alternative hypothesis (H1) or the null hypothesis (H0). The value of the Bayes factor (BF) indicates how many times more likely the data are under H1 than under H0. A BF higher than one gives stronger support to H1 than to H0, while a BF smaller than one supports H0 rather than H1. Conventionally, a BF > 3 can be interpreted as substantial evidence, whereas a BF > 10 is considered strong evidence (Rouder et al. 2018; Lee and Wagenmakers 2014). Plots were created in R using the package ggplot2 (Wickham 2009).
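The BIC approximation can be written as BF10 ≈ exp((BIC_H0 − BIC_H1) / 2). A minimal sketch follows; the function names and the wording of the labels are illustrative assumptions, and only the thresholds of 3 and 10 are taken from the text above.

```python
import math


def bf10_from_bic(bic_h0, bic_h1):
    """Approximate the Bayes factor for H1 over H0 from two BIC values.

    Uses BF10 ~= exp((BIC_H0 - BIC_H1) / 2); a lower BIC for the H1
    model therefore yields BF10 > 1.
    """
    return math.exp((bic_h0 - bic_h1) / 2.0)


def describe_bf10(bf10):
    """Label a Bayes factor using the thresholds given in the text.

    The label strings are illustrative, not the authors' wording.
    """
    if bf10 > 10:
        return "strong evidence for H1"
    if bf10 > 3:
        return "substantial evidence for H1"
    if bf10 > 1:
        return "support for H1 below the conventional threshold"
    return "no support for H1 over H0"


# Hypothetical BIC values, for illustration only.
bf = bf10_from_bic(bic_h0=104.0, bic_h1=100.0)
label = describe_bf10(bf)
```

A BIC difference of 4 in favour of the H1 model gives BF10 = exp(2), which falls between the two conventional thresholds.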
Persistence. The subjects that did not interact with the impossible bowl (Pd: object 1; FRd: social 1, alone 1) were excluded from the analyses of persistence, as persistence refers specifically to the duration of interacting with the impossible bowl. Two generalized linear mixed models (model-P1 for Pd and model-P2 for FRd) were run with persistence as the response variable, test order and condition (social, alone, object, dummy for Pd; social and alone for FRd) as explanatory factors and subject ID as random factor. To investigate differences in persistence between Pd and FRd in the presence of the human experimenter (social condition), a linear model (model-P3) was run with persistence as the response variable and group (Pd, FRd) as explanatory factor. The null models lacked the predictor condition for the comparison with model-P1 and model-P2. We calculated Bayes factors for condition in model-P1 and model-P2 and for group in model-P3.
Latency of looking back. In these analyses, we considered the latency to look back after attempting the impossible bowl once all the reachable food was eaten (see Table 2). The subjects that did not interact with the impossible bowl after the food was eaten were excluded from these analyses (1 Pd, 5 FRd). To investigate the differences in the latency of looking back between conditions (social, dummy and object) in Pd, we ran a Cox mixed-effects model (model-L1). A survival response variable was constructed using the Surv function (Therneau 2015), considering the latency to look back (or termination of the experiment) and whether this event occurred or not. Subject was included in the model as a random factor. Given that in the social condition all Pd looked back, to ensure model convergence we considered one subject in the social condition (with the longest latency to look back) as not having performed the behaviour. All FRd that finished the food and attempted the impossible bowl (ten social, nine dummy) looked back at the experimenter or at the dummy human, except one, which was excluded from the next analysis. To investigate the differences in the latency to look back between conditions in FRd, we ran a linear model (model-L2) with latency to look back as the response variable and condition (social, dummy) as the explanatory factor. To investigate differences in the latency to look back between Pd and FRd in the presence of the human experimenter, we ran a linear model (model-L3) with latency to look back as the response variable and group (Pd, FRd) as the explanatory factor. We calculated Bayes factors for condition in model-L1 and model-L2 and for group in model-L3, excluding the subjects that did not look back.
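The construction of the survival response for model-L1 can be sketched as follows. This is not the R code used for the analyses; the function names and the (time, event) tuple layout are assumptions made for illustration. Each subject contributes a time and an event indicator: the latency and 1 if it looked back, or the test duration and 0 if it did not (a censored observation). The second helper mirrors the adjustment described above, in which the subject with the longest latency in the social condition was treated as not having looked back.

```python
def survival_response(latency_s, test_duration_s):
    """Return (time, event) for one subject in one condition.

    latency_s: latency to look back in seconds, or None if the subject
               never looked back before the test ended.
    """
    if latency_s is None:
        return (test_duration_s, 0)  # censored at the end of the test
    return (latency_s, 1)            # event observed


def censor_longest(responses):
    """Mark the observation with the longest time as censored.

    responses: list of (time, event) tuples from a single condition.
    Returns a new list; the input is left unchanged.
    """
    longest = max(range(len(responses)), key=lambda i: responses[i][0])
    return [(t, 0) if i == longest else (t, e)
            for i, (t, e) in enumerate(responses)]


# Hypothetical latencies for three subjects in the social condition.
social = [survival_response(x, 600.0) for x in (5.0, 40.0, 12.0)]
adjusted = censor_longest(social)
```

Treating one event as censored slightly understates how quickly dogs looked back in the social condition, so any resulting bias runs against finding a difference in that direction.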
Effect of condition and group on the frequency of looking back after attempting the impossible bowl. The subjects that never attempted the impossible bowl were excluded from these analyses (Pd: object 1; FRd: dummy 3). To investigate the differences in the frequencies of looking back after attempting the impossible bowl between conditions in Pd and FRd, a generalized linear mixed model for Pd (model-F1) and a generalized linear model for FRd (model-F2) with a quasibinomial distribution were run with the occurrence of looking back (see Table 2) as the response variable, normalized by the total number of times the subject attempted the impossible bowl, condition (social, dummy, object for Pd; social and dummy for FRd) as the explanatory factor and subject as the random factor (only for model-F1, the mixed model). To investigate the differences in the frequencies of looking back at the experimenter after attempting the impossible bowl between pet dogs and free-ranging dogs, a generalized linear model with a quasibinomial distribution (model-F3) was run with the occurrence of looking back as the response variable, normalized by the total number of times the subject attempted the impossible bowl, and group as the explanatory factor (this analysis was run only for the social condition). The null models lacked the predictor condition for the comparison with model-F1.
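The normalized response can be illustrated with a short sketch. It assumes that looking back is scored at most once per attempt, so that each subject contributes a count of attempts followed by a look back and a count of attempts not followed by one; the function name and data layout are hypothetical. Subjects with no attempts are dropped, in line with the exclusion rule above.

```python
def look_back_response(records):
    """Build the binomial-type response per subject.

    records: dict mapping subject ID to (attempts, look_backs), where
             look_backs counts attempts that were followed by a look back
             (assumed to be at most one per attempt).
    Returns a dict mapping subject ID to
    (look_backs, attempts_without_look_back, proportion).
    """
    response = {}
    for subject, (attempts, look_backs) in records.items():
        if attempts == 0:
            continue  # never attempted the impossible bowl: excluded
        if not 0 <= look_backs <= attempts:
            raise ValueError(f"inconsistent counts for {subject}")
        response[subject] = (look_backs,
                             attempts - look_backs,
                             look_backs / attempts)
    return response


# Hypothetical counts, for illustration only.
example = look_back_response({"dog_a": (4, 3), "dog_b": (0, 0), "dog_c": (5, 0)})
```

Keeping the two counts rather than only the proportion preserves the number of attempts, so that a subject with many attempts carries more weight than one with few.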
Effect of group and the obtainability of food (possible vs. impossible bowl) on the frequency of looking back. These analyses were run on the whole test duration, only for the social condition. We investigated the differences in the frequencies of looking back between Pd and FRd, after attempting either the possible or the impossible bowl. We ran a generalized linear mixed model (model-F4) with a binomial distribution with the occurrence of looking back (see Table 2) as the response variable, normalized by the total number of times the subject looked up (see Table 2) after attempting the bowl. The group, the attempted bowl (possible or impossible) and their interaction were included as explanatory factors. The null model lacked both predictors.
Duration of looking back and emotional arousal. These analyses were run on the whole test duration. All tested subjects were included in these analyses. To investigate whether the proportion of time individuals looked back at the experimenter or wagged their tails differed between conditions, we ran two generalized linear mixed models for Pd (model-DL1 and model-W1) and two generalized linear models for FRd (model-DL2 and model-W2) with beta error structure and logit link function. We included condition (social, object, dummy for Pd; social, dummy for FRd) as an explanatory factor and subject as a random factor (only for model-DL1 and model-W1).
To investigate whether the proportion of time individuals looked back at the experimenter or wagged their tails differed between Pd and FRd in the social condition, we ran two generalized linear models (model-DL3 and model-W3) with beta error structure and logit link function. We included group (Pd, FRd) as an explanatory factor. For model-DL3, to account for the possibly greater number of distractions for FRd, which were tested outdoors, than for Pd, the response variable was the total time that the subjects looked at the experimenter divided by the total time the subjects looked up (see Table 2). For all the other models (model-DL1, model-W1, model-DL2, model-W2, model-W3), the response variable was the total time that the subjects looked back or wagged their tails divided by the total duration of the test.
The null models lacked the predictor condition for the comparison with model-DL1 and model-W1. We calculated Bayes factors for condition in model-DL1, model-W1, model-DL2, model-W2 and for group in model-DL3 and model-W3.