The study consisted of four parts: (1) Familiarisation and ingredient preferences; (2) Affective forecasting test; (3) Control for colour biases in the orangutan’s performance in the AF test and (4) Independent post-experimental measures of taste preferences for ingredients and mixes (see Table 1 for an overview of the study).
One male Sumatran orangutan (Pongo abelii) and ten humans (four females) took part in the study. The orangutan (Naong, born 1990) was 21 years old at the beginning of the study and was housed at Furuvik Zoo/Lund University Primate Research Station Furuvik in Sweden. His enclosure, comprising indoor quarters and an outdoor island, was shared with a female of similar age. The female, who was newly arrived at the station and avoided unfamiliar humans, could not be involved in the study. Following the general policy of the research station, the orangutan engaged voluntarily in testing by entering the experimental room and was free to disengage at any time. The orangutan was tested across several days, at roughly the same time of day, about 1–2 h after having had a meal.
The human participants (aged 20–35 years) were recruited and tested at Lund University, in Sweden. The call for participation mentioned the duration of the experiment and that it involved drinking small quantities of liquids, some of which were unpleasant to taste. After signing up, the participants were instructed not to consume any food or liquids prior to or during an experimental session. Participants were tested separately, in individual sessions. They were first acquainted with the set-up and presented with the instructions. The latter specified that the experiment consisted of making a choice between two small amounts of liquid and subsequently drinking (or at least tasting) the chosen liquid. The participants were also informed that they were free to verbalise throughout the experiment if they wished to do so. Finally, they were informed that they were free to quit the experiment at any time and that their participation would be compensated with cinema gift certificates. After having had the opportunity to ask questions concerning the experiment, the participants signed informed consent forms.
General procedure and materials
In each study phase, the participants were given a forced-choice task in which they could select between two liquids from a table, by using their hand, finger or a plastic straw. Liquid presentation was counterbalanced with respect to the position on the table. The liquids were presented in small plastic containers, in portions of 10 ml each. Given the different testing conditions at the two sites (Lund University/Furuvik Zoo), we employed reusable bottles for the orangutan testing and disposable glasses for the human testing. The bottles and the glasses were comparable with respect to size and volume. In the orangutan set-up, the liquids were briefly presented outside the subject’s reach on a retractable table. The table was then pushed towards the subject so that he could make a choice by extending a drinking straw (typically held between his lips) towards one of the bottles; sometimes finger pointing was used. He was then allowed to drink the chosen liquid while the other bottle was removed from the table. The orangutan consumed the liquid with the help of the straw, through the cage bars. In the human set-up, participants were seated at a table, across from the experimenter. The participants were explicitly instructed that as soon as they lifted a glass from the table, this would be recorded as a choice. Unlike the orangutan, they drank directly from the glasses. At both sites, water was freely available. The humans were provided with buckets for discarding non-ingested liquid.
Two experimenters were involved in conducting the orangutan testing—one experimenter prepared the stimuli and the other administered the task. During trial administration, the experimenter was silent and refrained from making head turns or gazing to the left or right, to avoid potential cueing. Only one experimenter conducted the human testing, as testing conditions at Lund University were less demanding.
The ingredient set included cherry juice, rhubarb juice, lemon juice, and diluted apple cider vinegar; this set was derived from an initial battery of seven liquids (see Online Resource 1 for more details on the selection procedure and results). In the orangutan testing, cherry and rhubarb juice were presented in their natural colour—red and pink, respectively. The colour of lemon juice and vinegar, which was similar for the two liquids, was altered to light green and dark green, respectively, by using food dyes. Since some (but not all) of the human participants were familiar with some of the colour–flavour associations (i.e. red-cherry and pink-rhubarb) used in the orangutan testing, a reversed colour scheme was employed in the human testing. Cherry juice was coloured in dark green, rhubarb juice in light green, vinegar in red, and lemon juice in pink. This ensured that all human participants were learning novel ingredient colour–flavour associations. The reversed colour scheme was also employed in part (3) of the study (Control for colour biases), which was administered to the orangutan only. The food dyes used for changing juice colours in the orangutan and human testing had no discernible taste that could have altered juice flavour.
Familiarisation and ingredient preferences
The aim of this initial phase was to ascertain that the participants were sufficiently familiarised with the ingredients, so that the materials would be suitable for further testing.
Procedure and materials
The trials administered in this phase were binary choice trials in which the four ingredients were paired with each other, thus forming six unique ingredient pairs. The human participants received 30 familiarisation trials in which each unique pair of ingredients occurred five times, in blocked trials. To ascertain that participants were sufficiently familiarised with the ingredients, they received an additional 24 trials (four trials/ingredient pair), in which ingredient pairs were presented in randomised order rather than in blocks.
For the orangutan, the preliminary phase of ingredient selection (see Online Resource 1 for more details) served also to familiarise the subject with the experimental ingredient set. After ingredient selection/familiarisation, just like the human participants, the orangutan received 24 randomised trials with the six ingredient pairs. He received an additional 26 such randomised trials in the middle of the AF test, as well as before the colour control.
To ensure that participants were sufficiently familiarised with the ingredients, choice-derived preferences in the blocked trials were compared with choice-derived preferences in the randomised trials. Preference scores were computed as percentages representing the number of times an ingredient was chosen across all occasions in which it was encountered. Individual ingredient preferences did not differ significantly across the two set-ups (all Ps > 0.05, range 0.11–1, Fisher’s exact test). This suggested that all participants had been sufficiently familiarised with the ingredients and had formed stable preferences for them.
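The preference score described above can be illustrated with a minimal sketch. The trial log below is made up purely for illustration, and the function name `preference_scores` and the `(option_a, option_b, chosen)` record layout are assumptions rather than part of the study's materials.

```python
from collections import Counter

def preference_scores(trials):
    """Return, for each liquid, the percentage of encounters in which it was chosen.

    `trials` is an iterable of (option_a, option_b, chosen) tuples.
    """
    chosen = Counter()
    encountered = Counter()
    for option_a, option_b, pick in trials:
        if pick not in (option_a, option_b):
            raise ValueError(f"choice {pick!r} is not one of the two options")
        encountered[option_a] += 1
        encountered[option_b] += 1
        chosen[pick] += 1
    return {item: 100.0 * chosen[item] / encountered[item] for item in encountered}

# Hypothetical trial log (illustrative only, not data from the study).
toy_trials = [
    ("cherry", "lemon", "cherry"),
    ("cherry", "vinegar", "cherry"),
    ("rhubarb", "lemon", "rhubarb"),
    ("cherry", "rhubarb", "rhubarb"),
]
scores = preference_scores(toy_trials)
```

Because each score is normalised by the number of encounters, liquids that occur an unequal number of times remain comparable.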
Affective forecasting test
In order to probe their AF ability, participants were presented with a task whereby novel choice situations were systematically created by pairing a familiar ingredient with a novel mix, which was obtained by combining two familiar ingredients. By administering this task, we sought to examine how participants responded when confronted with novel juice mixes. More specifically, the aims were (1) to obtain a preference ranking for ingredients and mixes; (2) to assess whether subjects were consistent in their choices; and (3) to rule out the presence of certain biases (novelty, volume). A central prediction of the hypothesis that only humans possess AF is that a non-human animal will exhibit trial-and-error performance upon its first encounters with never-before-experienced situations. In the context of our task, this can be measured by assessing whether the orangutan subject exhibits random as opposed to consistent choices across the first and second encounters with each novel ingredient-mix pair. In this assessment, random choices would be indicative of trial-and-error performance. Evidence of choice consistency, on the other hand, would suggest an ability to make principled choices even when confronted with never-before-experienced stimuli and contexts. Note, however, that choice consistency is an insufficient criterion for establishing the presence of an ability to make hedonic predictions concerning novel experiences, as non-hedonic criteria might also underlie consistent choices. For example, the orangutan could have chosen based on the novelty of the mixes or shown a bias towards avoiding (or preferentially choosing) the mix. Moreover, given the different portion sizes of the two liquids presented in each AF test trial (as detailed below), the subject could have been biased towards choosing the larger portion.
As in the Familiarisation and ingredient preferences, the participants were administered a binary forced-choice task. By systematically pairing familiar ingredients with novel mixes, 24 novel and unique ingredient-mix pairs were obtained. Each subject received a total of 96 trials in which the 24 ingredient-mix pairs were presented in randomised order. Each unique ingredient-mix pair occurred four times, but typically only once every 24 trials. The task was administered in two conditions: transparent (trials 1–48) and concealed (trials 49–96).
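The trial structure can be sketched as follows. The sketch assumes that every one of the four ingredients was paired with every one of the six two-ingredient mixes (4 × 6 = 24 pairs) and that randomisation was done in four shuffled blocks of 24, which is one simple way to make each pair occur about once every 24 trials. The function name `build_schedule` and the seed are hypothetical.

```python
import random
from itertools import combinations

INGREDIENTS = ["cherry", "rhubarb", "lemon", "vinegar"]

# Six mixes: every unordered combination of two ingredients.
mixes = list(combinations(INGREDIENTS, 2))

# 24 ingredient-mix pairs (assumption: every ingredient meets every mix).
pairs = [(ingredient, mix) for ingredient in INGREDIENTS for mix in mixes]

def build_schedule(pairs, n_blocks=4, seed=0):
    """Concatenate n_blocks independently shuffled copies of the pair list."""
    rng = random.Random(seed)
    schedule = []
    for _ in range(n_blocks):
        block = list(pairs)
        rng.shuffle(block)
        schedule.extend(block)
    return schedule

schedule = build_schedule(pairs)
transparent, concealed = schedule[:48], schedule[48:]  # trials 1-48 and 49-96
```

Under this blocking assumption, each pair occurs exactly twice in the transparent condition and twice in the concealed condition.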
In the transparent condition, the participants had constant visual access to the liquids contained in the bottles. In each of these trials, three bottles, each containing 10 ml of an ingredient, were placed on the table (Fig. 1, Step 1a/b). The content of one bottle was then poured into an adjacent bottle, so that two ingredients were mixed in front of the participants, resulting in a novel drink (Fig. 1, Step 2a). The empty bottle was removed from the table and the participants had to choose between 10 ml of a familiar ingredient and 20 ml of a novel mix (Fig. 1, Step 3a).
In the concealed condition, to increase the demands for mental representation in the absence of tangible information, visual access to the stimuli was obstructed before the mix was produced by the experimenter. More specifically, the participants were allowed brief visual access (typically 5–10 s) to the three bottles containing ingredients (Fig. 1, Step 1a/b), after which the contents of the bottles were concealed (Fig. 1, Step 2b). The participants thus witnessed neither the actual mixing of the ingredients nor the ensuing mix; they could, however, see that the content of one bottle was poured into another, concealed, one (Fig. 1, Step 3b). After at least 8 s had elapsed from the last visual access to the content of the bottles, the participants were given an opportunity to choose between two concealed bottles (Fig. 1, Step 4b). This set-up prevents learnt colour–taste associations for the mixes from driving choices and constrains the participants to form and keep a representation of the stimuli active in working memory, i.e. beyond the two-second window of sensory short-term memory (as reviewed by Carruthers 2013).
Before engaging in the task, the orangutan received a total of 27 training trials. In 15 of these, it was ascertained that he was able to understand that liquid volume remained equal when poured into a concealed container. These 15 trials were binary choices between familiar ingredients. The remaining 12 trials were aimed at ascertaining that the juice-mixing event, given its salience, would not engender novelty biases for the subject. Non-experimental juices were used in these trials, including three liquids discarded during the ingredient selection phase (blueberry juice, strawberry juice, salt water), and a fourth added one (artichoke). The first six of these 12 trials were binary choices between ingredients (similar to the randomised trials in Familiarisation and ingredient preferences), to determine that the subject recognised them. The last six trials introduced the novel procedure in which binary choices paired a familiar ingredient with a novel mix. The subject did not show a bias for ingredients or mixes, but selected them an equal number of times.
Test-derived individual preferences for ingredients and mixes
In the experimental set of 24 novel ingredient-mix pairs, ingredients and mixes occurred an unequal number of times, with each of the four ingredients occurring more often than each of the six ensuing mixes. For this reason, individual preference scores for each of the ten liquids were computed as percentages representing the number of times a given liquid was chosen out of the total number of occasions on which it was encountered in the first and second trials for each unique ingredient-mix pair. Individual preference scores and a preference ranking are presented in Fig. 2 for the orangutan and in Fig. 3 for the ten human participants.
Choice consistency
Across the first and second encounters with each novel ingredient-mix pair, the orangutan chose identically in 88 % of cases (21 of 24 possible pairs), which is significantly different from chance (P < 0.001, binomial test). Choice consistency for the human participants ranged from 71 to 92 % (17–22 consistent choices of 24 possible), being significantly different from chance for eight individuals (Ps ≤ 0.02, binomial test) and closely approaching significance for the remaining two (P = 0.06, see Table 2 for more details). To determine if there were cross-species differences with respect to choice consistency, the orangutan’s performance was compared, separately, with the performance of each human participant. We found the orangutan’s performance to be similar to that of the humans (all Ps ≥ 0.29, Fisher’s exact test).
Choice consistency was further assessed across transparent and concealed trials, as well as within the concealed trials. Across first concealed and last transparent trials for each unique ingredient-mix pair, the orangutan’s level of consistency was 82 % (P < 0.01, binomial test). All human participants but one showed similarly high levels of consistency, ranging between 83 and 100 % (all Ps < 0.01, see Table 2 for more details). Within the concealed condition, the orangutan’s level of consistency was 90 % (P < 0.001, binomial test); the level of consistency for the ten human participants ranged from 79 to 100 % (all Ps < 0.01, see Table 2 for more details).
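As an illustration of the tests reported for the first and second encounters, the following sketch recomputes the exact binomial and Fisher probabilities from the counts given in the text (21 of 24 for the orangutan, 17 of 24 for the least consistent humans). It assumes two-sided tests, which the text does not state explicitly, and the function names are hypothetical.

```python
from math import comb

def binomial_two_sided(successes, n):
    """Exact two-sided binomial test against chance (p = 0.5).

    With p = 0.5 all sequences are equally likely, so integer counts suffice.
    Sums the probability of every outcome at most as likely as the observed one.
    """
    observed = comb(n, successes)
    extreme = sum(comb(n, k) for k in range(n + 1) if comb(n, k) <= observed)
    return extreme / 2 ** n

def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c

    def weight(x):
        # Unnormalised hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    observed = weight(a)
    extreme = sum(weight(x) for x in range(lo, hi + 1) if weight(x) <= observed)
    return extreme / comb(row1 + row2, col1)

p_orangutan = binomial_two_sided(21, 24)      # 21 of 24 consistent choices
p_human_low = binomial_two_sided(17, 24)      # 17 of 24 consistent choices
p_compare = fisher_two_sided(21, 3, 17, 7)    # orangutan vs. a 17-of-24 human
```

Under these assumptions the sketch gives approximately 0.0003, 0.064 and 0.29, in line with the reported P < 0.001, P = 0.06 and the lower bound of P ≥ 0.29.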
Control for volume and novelty biases
To rule out the possibility that such biases affected the orangutan’s choices in the AF test, we checked whether the orangutan showed a preference for ingredients (or, conversely, mixes) in these trials. In the first and second trials for each unique ingredient-mix pair, the orangutan chose ingredients in 21 cases and chose mixes in the remaining 27 (P = 0.48, binomial test). Likewise, across all 96 trials that were administered in the AF test, the ratio of mix versus ingredient choices was 55–41, indicating that there was no significant preference for the novel versus familiar type of stimulus nor for the larger volume of liquid (P = 0.18, binomial test). The human participants chose on average 23.5 ingredients (range 17–28) and 24.5 mixes (range 20–31). Separate comparisons between the orangutan and each human participant showed no significant differences concerning choice distribution between ingredients and mixes in the first two encounters with each novel ingredient-mix pair (all Ps > 0.05, range 0.22–1, Fisher’s exact test).
Control for colour biases in the orangutan’s performance in the AF test
Since ingredient selection led to an ingredient set that included exclusively sweet liquids in the red colour spectrum and sour liquids in the green spectrum, it was important to control for the possibility that colour biases affected the subject’s choices. According to the red–green axis hypothesis, primate trichromacy is an adaptation to a feeding ecology that involves discriminating potential food sources (ripe fruits, young leaves) from the rarely consumed green mature foliage. In line with this hypothesis, human experiments that employ small stimulus sets show that green colouring increases the perceived sourness of stimuli, while red colouring increases their perceived sweetness; such biases, however, are not present when large stimulus sets are employed (e.g. Spence et al. 2010, for a review). A study with Bornean orangutans (Pongo pygmaeus) suggests that colour biases might affect non-human apes as well, since one juvenile individual was found to consume more of the same food when this was coloured red (Barbiers 1985).
The colour control was administered to the orangutan subject, which, as a representative of a non-human species, is the focal subject of the study. The human participants did not receive a similar control task, since the presence of AF in humans is not contested. Instead, the human participants served as a control group for assessing whether the orangutan’s performance in the key AF test was comparable to that of humans.
Materials and procedure
To control for the possibility that the subject preferentially chose red juices (and their combinations) over green ones on the basis of their colour rather than their taste, ingredient colours were reversed after the completion of the AF test. Using food dyes, cherry juice was coloured dark green, rhubarb juice light green, vinegar red and lemon juice pink. Following a brief phase in which original colour–flavour associations were extinguished (see Online Resource 2 for more details), the subject received 36 trials in order to establish choice-derived preferences for the ingredients presented in reversed colours. These preferences were compared with ingredient preferences derived from choices in the preliminary phase, when ingredients were presented in their ‘original’ colour. The procedure was similar to the one in the last phase of Familiarisation and ingredient preferences. Each of the six possible ingredient pairs was presented in randomised order and occurred six times.
A comparison of choices of the ingredients presented in their original colour with choices of ingredients presented in reversed colours revealed no significant differences across the two stimulus variations (P = 0.59, Fisher’s exact test). Indeed, in the ‘original’ ingredient preference trials the orangutan chose sweet drinks in 76 % of the trials, while in the trials with reversed colours he chose sweet drinks in 83 % of the trials. The results indicate that the subject’s choices in the AF test were not affected by colour biases of the kind predicted by the red–green axis hypothesis.
Summing up the results thus far, we established that the orangutan performed non-randomly when presented with novel mixes and novel choice contexts and that his performance was within the range of that shown by the humans. We further ruled out the possibility that certain non-hedonic criteria—including novelty, volume or colour—underlie his consistent choices in the first encounters with novel mixes and novel choice contexts.
Independent post-experimental measures of taste preferences for ingredients and mixes
The aim of this final part of the study was to determine if participants’ choices when presented with novel mixes (in the AF test) were motivated by hedonic forecasts, i.e. by how the mixes were predicted to taste. For this purpose, separate measures of taste preferences were obtained from the participants, in the absence of additional task demands, such as ingredient mixing. These were then compared to choice-derived preferences in the first and second encounters with the novel ingredient-mix pairs in the AF test. Finding a relationship between the two preference measures would indicate that participants’ performance in the AF test was supported by a mental process that maximised the likelihood of selecting the most pleasant outcome.
Procedure and materials
An independent preference ranking for the four ingredients and the six ensuing mixes was obtained from the human participants by means of self-report. More specifically, they were asked to rank the ten liquids from most to least preferred. This also allowed us to corroborate taste preferences based on behavioural responses (i.e. participants’ choices in the AF test) with self-reported preferences after task completion, i.e. after the novel juices had been experienced several times. This procedure parallels a commonly employed approach in AF research, whereby self-reports of predicted hedonic outcomes for certain events are compared with self-reports of the experienced hedonic impact of those events.
The orangutan was presented with a new set of binary choices in which ingredients and mixes were contrasted pairwise in blocked trials. Crucially, in the post-experimental preference trials, the ten juices were presented in ‘disguise’. The ingredients were reverted to their original colour, and the mixes were randomly assigned new colours: yellow (lemon–vinegar), orange (cherry–rhubarb), light blue (rhubarb–lemon), dark blue (cherry–lemon), brown (cherry–vinegar), and milky green (rhubarb–vinegar). Furthermore, the mixes were presented pre-blended, thus taking the appearance of novel ingredients. Liquids in a pair were now presented in equal portions of 10 ml each. Prior to administering the first trial of each block, the subject was allowed to sample each liquid in the respective pair. There were typically five trials in each block, so that each unique pair of liquids typically occurred five times. A preference ranking was then derived based on scores representing the percentage of times a stimulus was chosen across all the pairs in which it occurred.
To verify that hedonic predictions guided participants’ choices in the AF test, choice-derived preferences in the first two encounters with each novel ingredient-mix pair were compared with post-experimental preferences. The latter are summarised in Fig. 2 for the orangutan and Table 3 for the human participants. As this comparison relied on a small set of categorical data and tied ranks were expected, Kendall’s tau-b correlation coefficients were computed to establish whether the two preference measures were related (e.g. Agresti 2010). We found the orangutan’s preferences in the first two encounters with each ingredient-mix pair in the AF test to correlate highly and significantly with post-experimental preferences (τ = 0.67, P = 0.01, N = 10); a similar result was found for choices in the concealed trials (τ = 0.68, P = 0.008, N = 10). Collapsing ‘transparent’ and ‘concealed’ trials (i.e. all 96 test trials), we found task choices to correlate highly and significantly with post-experimental choices (τ = 0.71, P = 0.006, N = 10).
Similarly, for the human participants, test-derived preferences in the first two encounters with each novel ingredient-mix pair correlated highly and significantly with self-reported preferences, with correlation coefficients ranging from τ = 0.52 (P = 0.04, N = 10) to τ = 0.94 (P < 0.001, N = 10, see Table 4 for more details). Likewise, choice-derived preferences in the concealed trials correlated significantly with self-reported preferences: τ ranged from 0.54 (P = 0.04, N = 10) to 0.89 (P < 0.001, N = 10).
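For readers unfamiliar with the tie-corrected coefficient, the sketch below shows how Kendall's tau-b can be computed from two sets of scores. It is a plain pairwise implementation written for clarity, not the routine used in the study, and the example values are made up purely to show the handling of ties.

```python
from itertools import combinations
from math import sqrt

def kendall_tau_b(x, y):
    """Kendall's tau-b: (C - D) / sqrt((C + D + Tx) * (C + D + Ty)).

    C and D count concordant and discordant pairs; Tx and Ty count pairs
    tied only in x or only in y. Pairs tied in both variables are ignored.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = tied_x = tied_y = 0
    for i, j in combinations(range(len(x)), 2):
        dx, dy = x[i] - x[j], y[i] - y[j]
        if dx == 0 and dy == 0:
            continue
        if dx == 0:
            tied_x += 1
        elif dy == 0:
            tied_y += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    denominator = sqrt((concordant + discordant + tied_x)
                       * (concordant + discordant + tied_y))
    if denominator == 0:
        raise ValueError("tau-b is undefined when one variable is constant")
    return (concordant - discordant) / denominator

# Hypothetical scores with one tie in each variable (illustrative only).
tau = kendall_tau_b([1, 2, 2, 3], [1, 2, 3, 3])
```

In this toy example there are four concordant pairs, no discordant pairs and one tie in each variable, so the coefficient is 4 / 5 = 0.8 rather than 1.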
For the orangutan data, two Bradley–Terry models (Bradley and Terry 1952) were further implemented in order to estimate the predictive accuracy of the hypothesis that choices in the first two encounters with the novel ingredient-mix pairs were driven by hedonic predictions. This statistical approach is often applied to pairwise comparison data for the purposes of individual preference modelling. The assumptions of a Bradley–Terry model are that the data consist of paired choices and that, for each choice, the probability of choosing one item over the other depends on the subjective value of that item compared to the other item. This value is an unknown parameter that is estimated from the data. The two Bradley–Terry models were estimated using the bbmle package for R (Bolker 2008); the difference between the models lies in how the subjective values are assigned. In model A, subjective values for each ingredient or mix were estimated based on the assumption that subjective values did not change across relevant trials, i.e. first and second trials with each ingredient-mix pair in the test and post-experimental trials. Model B extended model A by estimating separate values for the first two times a specific novel pair was encountered and for the rest of the trials. Model A is consistent with the assumption that choices in the first two encounters with each ingredient-mix pair were guided by predictions concerning taste preferences. Model B, on the other hand, would better fit the data if the test trials examined were not driven by predicted taste preferences, thus differing from post-experimental choices. Three measures were used to compare the two models and all pointed to model A as being a better fit than model B, thus favouring the model assuming that hedonic predictions explain choices in the examined trials.
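To make the Bradley–Terry assumptions concrete, the following sketch fits a single set of subjective values to toy pairwise-choice counts, using the common parameterisation in which item i is chosen over item j with probability p_i / (p_i + p_j). It uses a simple iterative (minorisation-maximisation) update rather than the bbmle routine used in the study, and the win counts and function names are hypothetical.

```python
from math import log

def fit_bradley_terry(wins, n_iter=500, tol=1e-10):
    """Estimate Bradley-Terry strengths, normalised to sum to 1.

    wins[i][j] is the number of times item i was chosen over item j.
    Assumes every item was chosen at least once and rejected at least once.
    """
    items = sorted(wins)
    p = {i: 1.0 / len(items) for i in items}
    for _ in range(n_iter):
        updated = {}
        for i in items:
            total_wins = sum(wins[i].get(j, 0) for j in items if j != i)
            denom = sum((wins[i].get(j, 0) + wins[j].get(i, 0)) / (p[i] + p[j])
                        for j in items if j != i)
            updated[i] = total_wins / denom
        norm = sum(updated.values())
        updated = {i: v / norm for i, v in updated.items()}
        change = max(abs(updated[i] - p[i]) for i in items)
        p = updated
        if change < tol:
            break
    return p

def log_likelihood(wins, p):
    """Log-likelihood of the observed choices under strengths p."""
    return sum(count * log(p[i] / (p[i] + p[j]))
               for i, row in wins.items() for j, count in row.items())

# Hypothetical choice counts (illustrative only, not data from the study).
toy_wins = {
    "cherry": {"rhubarb": 3, "lemon": 4},
    "rhubarb": {"cherry": 2, "lemon": 4},
    "lemon": {"cherry": 1, "rhubarb": 1},
}
strengths = fit_bradley_terry(toy_wins)
toy_log_lik = log_likelihood(toy_wins, strengths)
```

A model of the kind described as model B would fit two such sets of values, one for the first two encounters and one for the remaining trials, at the cost of additional parameters.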
A comparison of the two models using the Akaike information criterion (Akaike 1981) favoured model A (AIC = 143) over model B (AIC = 152), as did a comparison using the Bayesian information criterion (Schwarz 1978) with model A having a BIC of 173 and model B having a BIC of 202. Further, a likelihood ratio test showed no statistically significant improvement of using model B over model A [χ²(6) = 2.64, P = 0.85].
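The three comparison measures are related, as the following sketch illustrates. It uses the standard definitions AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L, and a closed-form chi-square tail that is valid only for even degrees of freedom. The parameter count k = 9 for model A is an assumption made for illustration (ten liquids with one value fixed as reference); it is not stated in the text, and the number of observations needed for BIC is likewise not given.

```python
from math import exp, log

def aic(log_lik, k):
    """Akaike information criterion for a model with k free parameters."""
    return 2 * k - 2 * log_lik

def bic(log_lik, k, n):
    """Bayesian information criterion for k parameters and n observations."""
    return k * log(n) - 2 * log_lik

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term = total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return exp(-half) * total

lr_stat, extra_params = 2.64, 6            # reported chi-square statistic and df
p_value = chi2_sf_even_df(lr_stat, extra_params)

k_a = 9                                    # ASSUMPTION: not stated in the text
log_lik_a = (2 * k_a - 143) / 2            # implied by the reported AIC of 143
log_lik_b = log_lik_a + lr_stat / 2        # likelihood ratio = 2 * (lnL_B - lnL_A)
aic_a = aic(log_lik_a, k_a)
aic_b = aic(log_lik_b, k_a + extra_params)
```

Whatever the value of k, the AIC difference between the models equals 2 × 6 − 2.64 = 9.36, which agrees with the reported difference of 152 − 143 = 9 after rounding; the tail probability comes out at about 0.85, as reported.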