Animals and housing
Given that housing and previous negative interactions with humans seem to affect the psychological development of horses (Fureix et al. 2012; Baragli et al. 2014), the subjects were selected from among horses trained with conventional procedures and kept in appropriate housing.
The experiment was conducted in April 2017 at the “Pelliccia” Riding Centre (San Marcello Pistoiese, Tuscany, Italy). The tested animals (14 horses of different ages and breeds, see Online Resource 15) were selected based on features already defined in the pilot study. Horses were chosen for their familiarity with people and their confidence in the arena in which the test would be performed. Their predisposition to explore unfamiliar objects was also taken into account (see Baragli et al. 2017 for details). The horses were stabled in individual stalls and had daily paddock turnout in a social environment; they showed no stereotyped behaviors and had the same feeding schedule (ad libitum access to hay and water, grass during paddock turnout and concentrates once a day).
This study was carried out in accordance with the recommendations of the Italian Animal Care Act (Decree Law 26/2014). The Ethical Committee on Animal Experimentation of the University of Pisa approved the experimental protocol (ref. n. 62,131). The owners gave written consent to the use of their horses in this experiment.
The testing area
The entire experiment was performed in a covered arena (the “round pen”, a circular enclosure typically used for training horses). The arena was divided into two parts: the testing area and an out-of-testing area (behind the mirror and lateral areas).
The testing area was further divided into four areas, defined by their position relative to the mirror (Fig. 1). The starting position was in line with the mirror, and the spaces to the right and left of this line were symmetric. This allowed us to avoid lateral biases due to the environment.
We tested each horse individually. The caretaker led the tested horse to the starting point, removed the halter, and released the horse. The experimental design comprised four phases preceded by a familiarization period. In this period, the arena was set up as in the experimental phases but without the mirror. Although all horses were accustomed to the covered arena, we started with this familiarization period to rule out undesirable behaviors (frustration and stress-related behaviors).
The four experimental phases:
Phase 1 (Covered Mirror, CM; day 1). In this phase, the mirror was placed, with its reflective surface facing outwards, in the position it occupied for the whole duration of the study.
Phase 2 (Open Mirror, OM; day 2). The reflective surface of the mirror was turned towards the testing area, thus facing mirror areas 1 and 2 (Fig. 1). Therefore, the tested horse could perceive its image in the mirror.
Phase 3 (Sham, S; day 3). A transparent cross-shaped figure (10 cm) was applied on both cheeks of the tested horse (Online Resource 1). The figure consisted of ultrasound water gel (Ultrasound gel, Gima, Milan, Italy). The sham mark was necessary to exclude the possibility that the animal's behavior was caused by the tactile or olfactory sensation of the mark rather than by the visual mark itself.
Phase 4 (Mark, M; day 4). The cross-shaped figures on both cheeks were colored by adding a small quantity of yellow or blue odorless, hypoallergenic finger paint (F.I.L.A., Fabbrica Italiana Lapis ed Affini S.p.A., Milan, Italy) to the transparent ultrasound water gel (Ultrasound gel, Gima, Milan) (Online Resource 1). The selection of two primary colors (yellow or blue) to mark the cheeks of the horse was based on horse color perception (Blackmore et al. 2008). To maximize the chromatic contrast and increase the probability that the subject could actually perceive the colored mark as different from the transparent one, we selected blue or yellow eye-shadow powder in relation to coat color (Baragli et al. 2017).
Between two consecutive tests, the mirror surface was cleaned with a hypoallergenic, odorless detergent to limit residual body odors from the previously tested animal. Feces were removed at the end of each test.
Each horse was tested at the same time on consecutive days. Each phase lasted 30 min and began when the halter was removed from the tested horse in the starting position.
The marks (both sham and colored) were placed on both cheeks because the panoramic visual field of horses does not cover this area of the head (Saslow 1999); the tested horse could therefore see the mark only with the aid of the mirror. The cheek was also chosen because the horse can easily reach that area with its limbs or by using environmental supports.
To standardize the marking procedure (size, shape and tactile sensation), we used three identical cross-shaped foam rubber stamps (sham, blue and yellow, 10 × 10 cm, Online Resource 1).
Before each phase, a 10-min grooming session was performed on the whole body to exclude the possibility that the horse felt that it was marked in a specific area (Anderson and Gallup 2015). Fifteen minutes before the Sham and Mark phases, the caretaker applied the mark (sham, yellow or blue). At the same time, an insect repellent (Tri Tec, Chifa srl, Angera, VA) was applied to the whole body of the horse to avoid insect disturbance.
During the test, nobody was present in the testing area. Immediately after releasing the horse, the caretakers moved into the service room, where they could monitor the progress of the test via remotely controlled cameras.
Data collection and analysis
From the videos collected in the Covered Mirror and Open Mirror conditions, we extracted the durations of selective attention, exploration and contingency behaviors (head movements, look behind, peek-a-boo and tongue protrusion). In the Sham and Mark conditions, we recorded the duration of scratching directed at the face (Face-SCR) and at the body (Body-SCR). The behaviors analyzed and their definitions are reported in Table 1.
The videos collected during each trial were analyzed by one of the authors (C.S.). To check inter-observer agreement and scoring reliability, 24 randomly selected 5-min video segments were assigned to a second observer, an expert in horse behavior who was unaware of the aim of the study (Cohen's kappa was never below 0.87 for any behavioral pattern defined in the ethogram, Table 1).
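As an illustration of the agreement statistic reported above, the following minimal Python sketch computes Cohen's kappa for two observers who each assign one categorical code per scored interval. The function name `cohens_kappa` and the toy codes are hypothetical; how the video segments were actually discretized for scoring is not specified here, so this should be read as a sketch under that assumption rather than as the procedure used in the study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical codes."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(rater_a)
    # Proportion of intervals on which the two observers agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each observer's marginal frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both observers used one and the same code throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy data: 1 = behavior present in the interval, 0 = absent.
observer_1 = [1, 1, 0, 0]
observer_2 = [1, 0, 0, 0]
kappa = cohens_kappa(observer_1, observer_2)
```

Kappa corrects raw percentage agreement for the agreement expected by chance, which is why it is preferred when some behaviors are rare.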
Using Kinovea (version 0.8.15) and VLC (version 3.0.6) with the Jump-to-Time extension, we analyzed the 22 h of video collected during the four conditions for the tested subjects.
Depending on the data distribution, parametric (Kolmogorov–Smirnov test p > 0.05; Paired Sample t test) or non-parametric (Kolmogorov–Smirnov test p < 0.05; Wilcoxon Signed Rank test) tests were applied for the group-level analysis. For the individual-level analysis, the Chi-Square “Goodness of Fit” test (expected frequencies higher than 5.0) was used. Statistical analyses were performed with SPSS (20.0) and the VassarStats website (http://vassarstats.net/).
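The individual-level test can be sketched as follows, assuming a comparison of one horse's counts of a behavior across two conditions against equal expected frequencies. The function names, the equal-expectation null and the example counts are illustrative assumptions, not values or choices taken from the study. The p-value helper is valid only for one degree of freedom (two categories), where the chi-square survival function reduces to the complementary error function.

```python
import math


def chi_square_gof(observed, expected=None, min_expected=5.0):
    """Chi-square goodness-of-fit statistic for a list of observed counts.

    If no expected counts are given, equal frequencies are assumed.
    Raises ValueError unless every expected frequency exceeds min_expected.
    """
    total = sum(observed)
    k = len(observed)
    if expected is None:
        expected = [total / k] * k
    if any(e <= min_expected for e in expected):
        raise ValueError("expected frequencies must be higher than %.1f" % min_expected)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_df1(statistic):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


# Hypothetical counts of one behavior for a single horse in two conditions.
counts = [30, 10]
statistic = chi_square_gof(counts)
p_value = p_value_df1(statistic)
```

With more than two categories the statistic is computed in the same way, but the p-value must come from a chi-square distribution with the corresponding degrees of freedom.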