Reversible, ambiguous, or bistable figures (e.g., Rubin’s face/vase picture, the Necker cube, the duck/rabbit picture) have been widely used in research paradigms since the 1800s (see Long & Toppino, 2004, for an overview). When looking at these objectively stable pictures, people continuously alternate between two mutually exclusive interpretations (e.g., from a face to a vase and vice versa; Kleinschmidt, Büchel, Zeki, & Frackowiak, 1998; Long & Toppino, 2004). These stimuli have been used to explore numerous perceptual phenomena, including binocular rivalry (Blake & Logothetis, 2002; Meng & Tong, 2004), the influence of cues on perception (see Panichello, Cheung, & Bar, 2013, for an overview), the ability of children to switch between the two interpretations (Doherty & Wimmer, 2005; Gopnik & Rosati, 2001; M. C. Wimmer & Doherty, 2011), the brain areas associated with perceptual switches (Britz, Landis, & Michel, 2009; Kleinschmidt et al., 1998; Lumer, Friston, & Rees, 1998; Zaretskaya, Thielscher, Logothetis, & Bartels, 2010), and perceptual hysteresis (Hock et al., 1993). Similarly, studies using picture sets that morph from one unique object (e.g., a rabbit) to another (e.g., a duck), with various levels of ambiguity in between, have shown that pictures are perceived categorically (i.e., as either a duck or a rabbit, but not as an alternate, third object representing the gradual merging of both; Hartendorp et al., 2010; Newell & Bülthoff, 2002; Verstijnen & Wagemans, 2004). It has also been demonstrated that the ability to switch between two identities in a morphing continuum can be significantly impaired in autism spectrum disorder (Burnett & Jellema, 2013) and after brain damage to the right hemisphere (Stöttinger et al., 2014). Morphing sequences have also been used in fMRI studies to investigate the neural correlates of perceptual decisions in a gradually changing environment (Heekeren, Marrett, Bandettini, & Ungerleider, 2004; Thielscher & Pessoa, 2007).
Despite the inherent utility of picture morphing sets across a wide range of research domains, from perception to decision making, only a limited number of stimulus sets are available. In addition, there is substantial variation in the quality of the image sets used, making it nearly impossible to compare findings across studies.

Given the potential benefits to a number of areas in psychology of having high-quality, well-normed morphing picture sets, we developed a larger collection of images with a consistent visual appearance. Additionally, we systematically varied whether pictures morphed from an animate to an inanimate object (or vice versa) or morphed within the animate and inanimate classes (Fig. 1). We report here the results of a study measuring normative naming and updating performance for 40 picture sets.

Fig. 1

Examples of the four different morphing classes used in the picture-morphing task

Experiment 1

Method

Participants

Two hundred one participants took part in this study (119 female, 82 male; mean age = 34.63 years, SD = 11.65). The University of Waterloo’s Office of Research Ethics approved the protocol, and all participants gave informed written consent. The participants were recruited through Amazon Mechanical Turk, using CrowdFlower as an intermediary to post the study. Participants received $1 for their participation. The vast majority of participants were North American (93 %) and of a Caucasian/white background (78 %; African American background = 9 %; South Asian = 5 %; Hispanic = 4 %; the remaining 4 % were from First Nations or East or Southeast Asian backgrounds). The Edinburgh Handedness Inventory revealed that most of the participants were right-handed (78.6 %), with a few being either left-handed (8 %) or ambidextrous (13.4 %).

Stimuli

Forty picture sets were tested. Each consisted of simple line drawings of commonly known objects. Picture sets displayed objects that either morphed from an animate object into another animate object (N = 10), from one inanimate object into another (N = 10), or from an animate into an inanimate object or vice versa (N = 20; Fig. 1). All of the pictures are available for download at http://tinyurl.com/lew33n6.

All picture sets were created using Morpheus Photo Morpher. Silhouettes of two objects were loaded into Morpheus Photo Morpher (Fig. 2). Pictures were obtained from the Internet or from a collection of hand-drawn pictures by the first author of this article and modified to guarantee smooth morphing (i.e., parts of the pictures were changed prior to the morphing process to maximize the overlap between the two pictures). Markers were placed on and around all key features in each picture to define which features of Object 1 would morph into which features of Object 2 (Fig. 2A). Each marker on Object 1 was associated with another marker on Object 2—represented in the figure by the same color (e.g., the stalk of the pear would become the neck of the violin; hence, the same numbers of markers were used in both pictures). Pictures were morphed over 15 iterations (Fig. 2B). The pictures of the morphed silhouettes were printed out, and outlines were traced manually using tracing paper. These hand-drawn outlines were scanned and the lines were smoothed using the quick trace tool of Corel Draw (Fig. 2C).

Fig. 2

Schematic demonstrating the various stages of stimulus generation. Please note that the figure presents a simplified reproduction of the Morpheus interface

All silhouettes were displayed on a white background (316 × 316 or 316 × 315 pixels). The lambda function of the Python Imaging Library (PIL) was used to compare the pixel changes between pictures. On average, the amount that each picture was morphed from one iteration to the next was 4 % (SD = 0.17 %; average change of 4,276.75 pixels, SD = 160.43 pixels; Fig. 3, left panel), with no significant difference between the individual picture positions [F(13, 507) = 0.97, p > .45, η² = .07] (Fig. 3, right panel). The average pixel changes for each picture set can be found in Table 1.
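The pixel comparison described above can be sketched as follows. This is a minimal illustration, not the authors' script: it assumes each silhouette has already been loaded and binarized into rows of 0/1 values (in practice this could be done with PIL), and the function names `changed_pixels` and `percent_change_series` are hypothetical.

```python
from typing import List, Sequence

# Assumption: each picture is a grid of 0/1 pixels (1 = black silhouette line).
Image = Sequence[Sequence[int]]


def changed_pixels(a: Image, b: Image) -> int:
    """Count pixel positions whose value differs between two same-sized images."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("images must have identical dimensions")
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def percent_change_series(images: List[Image]) -> List[float]:
    """Percentage of changed pixels for each consecutive pair (n, n + 1)."""
    total = len(images[0]) * len(images[0][0])
    return [100.0 * changed_pixels(x, y) / total
            for x, y in zip(images, images[1:])]


if __name__ == "__main__":
    # Three toy 2 x 2 "pictures" standing in for a 15-picture morphing set.
    series = [
        [[0, 0], [0, 0]],
        [[1, 0], [0, 0]],
        [[1, 1], [0, 0]],
    ]
    print(percent_change_series(series))  # [25.0, 25.0]
```

A 15-picture set yields 14 consecutive comparisons, which matches the 14 picture positions entering the analysis of variance reported above.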

Fig. 3

Average percentages of pixel change between the pictures [n – (n + 1)], collapsed for all 40 picture sets. The left panel displays the average change per picture; the right panel displays the percentages of pixel change, displayed separately for the different picture positions (Picture 1–Picture 2, Picture 2–Picture 3, etc.)

Table 1 Average numbers of first-object reports for each of the 40 picture sets, displayed separately for the orders of presentation (Object 1 → Object 2 vs. Object 2 → Object 1). The rightmost column displays the difference between numbers of first-object reports depending on the morphing direction. A positive number indicates that participants needed a higher number of images before they reported another object when Object 1 morphed into Object 2, and a negative number indicates the opposite pattern. * Significant difference at p < .05. ** Significant difference at p < .01. b Multiple modes exist. The earliest value is shown

Design and procedure

Questionnaires were designed using Qualtrics. At the beginning of each questionnaire, participants answered demographic questions and completed the Edinburgh Handedness Inventory (Oldfield, 1971). Each participant was then assigned to one of four different versions of the picture-morphing task, each containing 20 picture sets in the order Object 1 to Object 2 or Object 2 to Object 1 (please note that the assignment as Object 1 or Object 2 was arbitrary). This was done to limit the time that any one participant spent on the task. Each version included five picture sets of each kind of morphing sequence (animate → animate, animate → inanimate, inanimate → animate, inanimate → inanimate). No participant saw the same picture set twice or in both orders. For each participant, the order of the picture sets within a version was varied using the randomize function of Qualtrics. Participants saw one picture at a time; they were asked to type in the name for each picture and to continue by clicking on a button in the bottom right corner of the screen. The picture was then replaced by the next picture of the same series. Participants were encouraged to use only one word (Fig. 4). As in the procedures of Hartendorp and colleagues (2010), no time restrictions were included, given that the main interest of the study was to obtain identification performance under the most natural conditions.

Fig. 4

Screenshot of one picture seen by participants

Analysis

All answers for each picture set were collected and categorized as “1” (reporting the first object) or “2” (reporting the final object). All other answers that were not included in the valid list were coded as “3,” and empty cells were coded as missing values. The categorization was done independently by the first two authors. A comparison of all ratings revealed an interrater agreement of 88 %. Cases of disagreement were discussed and resolved to mutual satisfaction. A Python script was used to transform the written answers into the number “1,” “2,” or “3” or a missing value. Single omissions that were preceded and followed by an answer within the same category were manually corrected (e.g., if a participant answered “cat–blank–cat,” the omission was changed to “cat” and the corresponding number was assigned). For a complete list of all valid first-object reports for each picture set and both morphing directions, see List_Valid_Names.xlsx at http://tinyurl.com/lew33n6; for the Excel file created with the Python script, see Experiment_1.xlsx following the same link.
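The coding scheme can be illustrated with a short sketch. It is not the authors' script: the function names are hypothetical, the valid-name sets in the example are invented stand-ins for the lists in List_Valid_Names.xlsx, and the omission rule is automated here although the article describes it as a manual correction.

```python
from typing import List, Optional, Set


def code_answer(answer: Optional[str], first: Set[str], second: Set[str]) -> Optional[int]:
    """Map one typed answer to 1 (first object), 2 (final object),
    3 (any other answer), or None (missing value)."""
    if answer is None or not answer.strip():
        return None
    word = answer.strip().lower()  # assumption: matching ignores case and padding
    if word in first:
        return 1
    if word in second:
        return 2
    return 3


def fill_single_omissions(codes: List[Optional[int]]) -> List[Optional[int]]:
    """Fill a lone missing value when both neighbours carry the same code.
    Runs of two or more missing values are left untouched."""
    filled = list(codes)
    for i in range(1, len(codes) - 1):
        left, right = codes[i - 1], codes[i + 1]
        if codes[i] is None and left is not None and left == right:
            filled[i] = left
    return filled


if __name__ == "__main__":
    # Hypothetical valid names for an anchor -> hat picture set.
    first_names, second_names = {"anchor"}, {"hat", "cap"}
    answers = ["anchor", "", "anchor", "boat", "hat"]
    codes = [code_answer(a, first_names, second_names) for a in answers]
    print(codes)                         # [1, None, 1, 3, 2]
    print(fill_single_omissions(codes))  # [1, 1, 1, 3, 2]
```

Neighbours are read from the original sequence rather than the partially filled one, so a gap of two blanks between identical answers is not bridged.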

Results

Percentages of first-, second-, and other-object reports

Each picture set was viewed by at least 42 participants (range = 42 to 53; mean = 47.61, SD = 3.76). Separately for each of the picture sets and morphing directions, we calculated the percentages of participants who reported the first object, the second object, or a different object for each of the 15 images (a complete list of all valid first- and second-object reports for each picture set can be found at http://tinyurl.com/lew33n6). Figure 5 shows the average performance for all picture sets, separately displayed for the two morphing directions. Individual graphs for each picture set and each morphing direction can be found in Appendix A.
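The per-image percentages can be computed along the following lines. This is a sketch under stated assumptions: answers are already coded as 1, 2, 3, or None (missing), and missing values are excluded from the denominator, which the article does not specify. The function name `report_percentages` is hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional


def report_percentages(coded: List[List[Optional[int]]]) -> List[Dict[int, float]]:
    """For each image position, return the percentage of non-missing answers
    coded 1 (first object), 2 (second object), and 3 (other).

    `coded` holds one row per participant and one column per image."""
    n_images = len(coded[0])
    result = []
    for pos in range(n_images):
        valid = [row[pos] for row in coded if row[pos] is not None]
        counts = Counter(valid)
        result.append({
            code: (100.0 * counts[code] / len(valid) if valid else 0.0)
            for code in (1, 2, 3)
        })
    return result


if __name__ == "__main__":
    # Four toy participants, three images (a real set has 15 images).
    rows = [
        [1, 1, 2],
        [1, 2, 2],
        [1, 3, 2],
        [1, None, 2],
    ]
    for position, pct in enumerate(report_percentages(rows), start=1):
        print(position, pct)
```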

Fig. 5

Overall percentages of answers in Experiment 1, collapsed for all picture sets. The x-axis represents the gradual morph from the first object (100 % first object) to the second object (0 % first object). The blue line represents the percentages of responses identifying the first object, and the red line represents the percentages of responses identifying the second object. The green line represents the percentages of responses indicating a different object (i.e., other than the first or second object). The left panel displays the overall performance for all 40 picture sets morphing in Direction 1 (e.g., from an anchor to a hat). The right panel displays the average performance for the same picture sets in reverse order, Direction 2 (e.g., from a hat to an anchor)

The average percentages for each of the 40 picture sets were submitted to a repeated measures analysis of variance with Report Type (first, second, other) and Image Number (15 different morphing images, from 100 % first object to 0 % first object) as within-subjects factors, and Direction of Morphing (Direction 1 vs. Direction 2) as a between-subjects factor. This analysis demonstrated that neither “first” [F(1, 78) = 0.03, p > .86, η² = .000], “second” [F(1, 78) = 0.60, p > .60, η² = .003], nor “other” [F(1, 78) = 1.35, p > .20, η² = .017] reports were significantly affected by the morphing direction.

Updating: Number of first-object reports

Each presentation of an image within a picture set was designated as a trial. The dependent variable in each picture set was the sum of trials in which the participant reported seeing the first object. This number corresponded to the trial number before participants switched—or updated—to a different object. A higher number indicates a longer time to update. The average numbers of pictures that participants needed before they reported another object are displayed in Table 1 for each picture set and both orders of presentation. In 19 of the picture sets, the order of presentation significantly affected the average number of first-object reports (rightmost columns in Table 1).
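The dependent variable and the direction difference shown in the rightmost column of Table 1 can be sketched as below. The function names are hypothetical, and the input is assumed to be one coded answer sequence (1, 2, 3, or None) per participant and picture set.

```python
from statistics import mean
from typing import List, Optional, Sequence


def first_object_reports(codes: Sequence[Optional[int]]) -> int:
    """Number of trials in one picture set on which the first object
    (code 1) was reported."""
    return sum(1 for c in codes if c == 1)


def direction_difference(dir1: List[Sequence[Optional[int]]],
                         dir2: List[Sequence[Optional[int]]]) -> float:
    """Mean number of first-object reports for Object 1 -> Object 2 minus
    the mean for Object 2 -> Object 1 (positive: slower updating in the
    Object 1 -> Object 2 order)."""
    return mean(map(first_object_reports, dir1)) - mean(map(first_object_reports, dir2))


if __name__ == "__main__":
    # Two toy participants per direction, 15 trials each.
    direction_1 = [[1] * 7 + [2] * 8, [1] * 6 + [2] * 9]
    direction_2 = [[1] * 5 + [3] + [2] * 9, [1] * 6 + [2] * 9]
    print(direction_difference(direction_1, direction_2))  # 1.0
```

Note that the sum equals the trial number of the switch only when a participant does not return to the first object after updating; the sketch follows the definition as a sum.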

Participants on average needed 6.65 (SD = 0.99) images before they reported seeing another object. Hence, participants reported the second object slightly before the actual midpoint of the sequence (i.e., Picture 8). An analysis of variance for the four different morphing conditions (animate–animate, animate–inanimate, inanimate–animate, and inanimate–inanimate) revealed a significant main effect of condition [F(3, 597) = 7.33, p < .001, η² = .04]. This effect was due to a slightly, but significantly, lower number of first-object reports for inanimate–inanimate shifts (mean = 6.44, SD = 1.25) than for shifts between categories (animate–inanimate: mean = 6.80, SD = 1.19; inanimate–animate: mean = 6.63, SD = 1.23) or within animate objects (mean = 6.73, SD = 1.22; all ps < .05).

No significant difference emerged in the numbers of first-object reports dependent on gender [F(1, 198) = 0.08, p > .75, η² < .001] or handedness [F(2, 198) = 0.60, p > .55, η² = .006]. We found a small, but significant, negative correlation between the average number of first-object reports and age (r = –.18, p = .01, N = 201).

Experiment 2

Experiment 1 demonstrated consistent naming of the unambiguous images at the ends of each series for all but one picture set. For the mushroom–lamp picture set, fully 34 % of the participants failed to correctly identify the mushroom, even when it was presented as the first object. For the remaining 39 picture sets, participants correctly and reliably identified not only the first and last pictures in each series, but also several pictures after the initial image and several pictures preceding the final image. However, because the pictures were viewed in a series, it is possible that our estimates for consistent naming of the pictures were a reflection of participants being primed by earlier views. In order to have an independent estimate of how consistently each image in each series would be named when it was not presented in the context of other images in a sequence, we ran a second experiment. The participants in this experiment were exposed to a random order of pictures, containing only one picture from each series. This experiment provided normative naming data for our images without prior contextual influence.

Method

Participants

Four hundred ninety-seven participants took part in this study (60 % female, 40 % male; mean age = 39.39 years, SD = 12.50). The University of Waterloo’s Office of Research Ethics approved the protocol, and all participants gave informed written consent. Participants were recruited through Amazon Mechanical Turk and received US$0.50 for their participation. The vast majority of participants were North American (92 %) and of a Caucasian/white background (73 %; African American = 8 %; South Asian = 9 %; Hispanic = 3 %). The Edinburgh Handedness Inventory revealed that most participants were right-handed (79 %), and a few were either left-handed (8 %) or ambidextrous (12 %).

Stimuli, design, and procedure

Each participant was assigned to one of 15 different versions, each containing one picture from each picture set (i.e., 40 pictures total). Pictures from each set were assigned randomly to each of the 15 different versions. However, in all cases the constraint was enforced that each version contained roughly equal numbers of pictures from each stage of the picture sets (at least one but not more than four pictures from each position within a set). This guaranteed equivalent levels of difficulty for all versions of the task and that none of the versions contained predominantly middle (i.e., ambiguous) pictures. No participant saw more than one picture from each set. The sequence of pictures in each version was randomized once and kept constant for all participants.
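One way to build versions that satisfy this constraint is sketched below. It is not the authors' procedure, which is described only as random assignment under the constraint; the sketch swaps in a cyclic (Latin-square style) assignment with shuffled offsets, because it satisfies the constraint by construction. All names are hypothetical.

```python
import random
from collections import Counter
from typing import List

N_SETS, N_POSITIONS, N_VERSIONS = 40, 15, 15


def assign_versions(seed: int = 0) -> List[List[int]]:
    """Return versions[v][s]: the picture position (1..15) of set s that is
    shown in version v. Each picture of each set lands in exactly one version."""
    rng = random.Random(seed)
    # Each cyclic shift 0..14 is given to two or three sets
    # (40 = 10 * 3 + 5 * 2); shuffling decides which sets get which shift.
    shifts = [k % N_POSITIONS for k in range(N_SETS)]
    rng.shuffle(shifts)
    return [[(v + shifts[s]) % N_POSITIONS + 1 for s in range(N_SETS)]
            for v in range(N_VERSIONS)]


def satisfies_constraint(versions: List[List[int]]) -> bool:
    """True if every version holds between one and four pictures from each
    position within a set."""
    for version in versions:
        counts = Counter(version)
        if any(not 1 <= counts[p] <= 4 for p in range(1, N_POSITIONS + 1)):
            return False
    return True


if __name__ == "__main__":
    versions = assign_versions(seed=1)
    print(satisfies_constraint(versions))  # True
```

In this construction every version contains two or three pictures from each position, so no version is dominated by the ambiguous middle pictures. The trade-off is that the assignment is less random than unconstrained sampling.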

The same Python script as in Experiment 1 was used to transform the written answers into either the number “1,” “2,” or “3” or a missing value. The Excel file (Experiment2_xlsx) created by this script can be found at http://tinyurl.com/lew33n6.

Results

Each picture was viewed by 29 or more participants (range = 29 to 41; mean = 33, SD = 3.4). We collapsed the data across participants; for each picture set, we calculated the percentages of first-, second-, and other-object responses (Fig. 6). A comparison of the overall number of “other” reports in Experiment 2 with the overall number of “other” reports in Experiment 1 (collapsed for both morphing directions) revealed a slightly higher number of “other” reports in Experiment 1 (5.93 %, SE = 0.57 %) than in Experiment 2 (4.33 %, SE = 0.43 %) [t(78) = 2.22, p < .05]. Individual graphs containing all three report types for each picture set can be found in Appendix A, rightmost panels.
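As a rough plausibility check, the t statistic can be approximated from the reported means and standard errors. The sketch assumes that the 40 picture sets per experiment are the units of analysis (consistent with 78 degrees of freedom) and that the groups are independent with equal n, in which case the pooled-variance t reduces to the formula below. The function name is hypothetical.

```python
from math import sqrt


def t_from_summary(mean_1: float, se_1: float, mean_2: float, se_2: float) -> float:
    """Independent-samples t statistic from two means and their standard errors.
    With equal group sizes this matches the pooled-variance t."""
    return (mean_1 - mean_2) / sqrt(se_1 ** 2 + se_2 ** 2)


if __name__ == "__main__":
    # Reported "other" reports: Experiment 1 vs. Experiment 2 (in percent).
    t = t_from_summary(5.93, 0.57, 4.33, 0.43)
    print(round(t, 2))
```

The result is close to the reported t(78) = 2.22; a small gap is to be expected because the published means and standard errors are rounded.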

Fig. 6

Overall percentages of answers in Experiment 2, collapsed for all picture sets. The x-axis represents the gradual morph from an arbitrarily defined first object (100 % first object) to the second object (0 % first object). The blue line represents the percentage of responses to the first object, and the red line represents the percentage of responses to the second object. The green line represents the percentage of responses indicating a different object (i.e., other than the first or second object)

For a better comparison between the two experiments and the different morphing directions, we analyzed the percentages of answers depending on the morphing stage of each picture. That is, each image was designated a percentage in terms of how much it represented an arbitrarily defined “first”-reported object (from representing the picture 100 % to representing it 0 %), regardless of the initial direction of morphing (Appendix B, Table 2, Fig. 6). Figure 7 displays the overall percentages of answers, depending on whether the picture was presented as the first or the second object (Exp. 1, Direction 1 vs. Direction 2) or whether the picture series was presented in a random order (Exp. 2). (Appendix B displays the percentages of answers for each picture set and object.)
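The mapping from picture number to morphing stage can be sketched as follows. The sketch assumes 15 equally spaced steps, so that Picture 1 represents the first-presented object 100 %, Picture 8 is the 50 % midpoint, and Picture 15 represents it 0 %. The exact percentages used in Appendix B are not reproduced here, and the function name is hypothetical.

```python
def percent_of_object(picture: int, shown_as_first: bool, n_pictures: int = 15) -> float:
    """Percentage to which a picture represents a given object.

    `picture` is the 1-based position in presentation order. With
    `shown_as_first=True` the percentage refers to the object the sequence
    starts with; with False, to the object the sequence ends with.
    Assumption: equal morphing steps between consecutive pictures."""
    if not 1 <= picture <= n_pictures:
        raise ValueError("picture must lie between 1 and n_pictures")
    pct_first = 100.0 * (n_pictures - picture) / (n_pictures - 1)
    return pct_first if shown_as_first else 100.0 - pct_first


if __name__ == "__main__":
    print(percent_of_object(1, shown_as_first=True))    # 100.0
    print(percent_of_object(8, shown_as_first=True))    # 50.0
    print(percent_of_object(15, shown_as_first=False))  # 100.0
```

Expressing both morphing directions on this common scale is what allows responses to an object presented first, presented second, or presented in random order to be compared stage by stage.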

Fig. 7

Average percentages of answers at each morphing stage (ranging from representing the object 0 % to 100 %), collapsed for both morphing directions and experiments. The blue line represents the percentages of answers for objects presented as the first object. The red line represents answer percentages for objects presented as the second object. The green line represents the likelihood (as a percentage) that participants correctly identified objects presented in a random sequence (Exp. 2). The dashed line represents a picture composed equally of both objects (50 % Object 1, 50 % Object 2)

Discussion

Although a wide range of research domains are interested in perception under ambiguous and gradually changing conditions, only a few, highly variable picture sets are available. Given the amount of work that it takes to design and norm such sets, most studies employ only a limited number of picture sets. In addition, very few studies present normative data for all of the image sets used. Here we have developed a high-quality collection of morphing image sets that we make available for researchers with broad interests in perception and decision making. All pictures were presented in both a sequential (Exp. 1) and a randomly scrambled (Exp. 2) order. We found that when the images were presented within a morphing context, participants identified the second object slightly earlier than when the images were presented randomly (Fig. 7). The numbers of “other” reports were comparable in both experiments (although slightly smaller in Exp. 2), indicating that the pictures are perceived as either the first or the second object even when they are not presented in a gradually morphing sequence. Hence, the estimates for consistent naming of the pictures in Experiment 1 were not a reflection of the participants being primed by earlier views.

Due to their comparable perceptual complexities, these picture sets will be useful for EEG and fMRI studies investigating the neural activations associated with visual object representations, as well as for researchers interested in the dissociation between animate and inanimate objects (Caramazza & Shelton, 1998; Konkle & Caramazza, 2013; Kuhlmeier, Bloom, & Wynn, 2004; Mahon & Caramazza, 2009; Martin, 2007; Spelke, Phillips, & Woodward, 1995; Wiggett, Pritchard, & Downing, 2009). Picture sets that morph between categories (animate to inanimate or vice versa) allow one to determine when and how activations shift between brain areas representing the different levels of animacy. More broadly, these picture sets can be used to investigate the neural correlates of identity changes of objects (Valyear, Culham, Sharif, Westwood, & Goodale, 2006) and repetition priming (James, Humphrey, Gati, Menon, & Goodale, 1999, 2000). Thus, this picture-morphing task can contribute to a better understanding of the neural activation patterns underlying visual object representations.

These picture sets also provide a valid contribution to a better understanding of the neural correlates of perceptual rivalry using ambiguous or bistable figures, such as Rubin’s face/vase, the duck/rabbit picture, or the Necker cube (Bonneh, Pavlovskaya, Ring, & Soroker, 2004; Britz et al. 2009; Kleinschmidt et al. 1998; Long & Toppino, 2004; Lumer et al. 1998; Zaretskaya et al. 2010). In contrast to ambiguous figures that continuously, spontaneously, and unpredictably alternate between two mutually exclusive interpretations, our picture sets provide more control as to when the switch occurs.

This task is also of potential interest for research in developmental psychology, given the interest in children’s ability to shift between mindsets or perspectives: It has been demonstrated that children younger than 4 years of age have difficulty shifting their mindsets (Frye, Zelazo, & Palfai, 1995; Kloo & Perner, 2005; see Zelazo, Frye, & Rapus, 1996, for an overview) and that the ability to shift one’s mindset is correlated with performance in the false-belief task (Carlson & Moses, 2001; Frye et al., 1995; Kloo & Perner, 2003; Perner, Lang, & Kloo, 2002; H. Wimmer & Perner, 1983). Consequently, it has been hypothesized that both tasks require an understanding that a situation can vary, depending on the perspective that an agent has (Kloo & Perner, 2005). Hence, the ability to change or update a mental representation of the environment, together with an understanding of how things can change, seems to be a critical milestone that children have to master in their development. Given that our task can be administered to children as young as 3 years old (Stöttinger, Rafetseder, Anderson, & Danckert, 2013), it has great potential to provide valuable insight as to when and how this ability develops.

In the long run, our task also has the potential to serve as a diagnostic instrument. It has been demonstrated that brain lesions can result in selective categorical impairments in perception. Warrington and Shallice (1984), for example, demonstrated a disproportionate impairment for animate objects, whereas other studies have shown the reverse pattern in patients with category-specific agnosias (Caramazza & Shelton, 1998, for an overview). Hence, this task can allow for more sophisticated examination of impairments in those patients. Our task also has the potential to be used to detect the early stages of dementia, given that set-shifting is one of the first nonmemory domains to be affected in Alzheimer’s disease (Perry & Hodges, 1999, for a review). Finally, picture sets that shift between categories could provide a deeper insight into the dissociation between extra- and intradimensional shifts in frontal-lobe (Owen, Roberts, Polkey, Sahakian, & Robbins, 1991) and Parkinson’s (Downes et al. 1989) patients.

We developed a quick and easy task to assess how perceptual representations are updated on the basis of information gathered from the environment. This task can be used among challenging participant populations, including young children, healthy seniors, and brain-damaged patients (Stöttinger et al. 2014; Stöttinger et al. 2013). Given the simplicity and wide variety of the picture sets, the task will be of interest to a broad range of research domains in psychology.