When familiar objects are rotated away from the most familiar view, such as the canonical upright orientation, the time taken to name the object increases relatively systematically as a function of the angle of rotation (e.g., Hamm & McMullen, 1998; Jolicoeur, 1985; McMullen & Farah, 1991; McMullen & Jolicoeur, 1990; Tarr & Pinker, 1989). Such findings have generally been interpreted to mean that objects are represented in a view-based manner, so that their identity is not established until the perceived view is matched to the stored view (which is presumed to be upright) through some form of spatial transformation or interpolation (Bülthoff & Edelman, 1992; Edelman & Bülthoff, 1992; Leek, Atherton, & Thierry, 2007; Tarr, 1995; Tarr & Pinker, 1989; Ullman, 1989). Inherent in this approach is the idea that an object’s viewpoint or orientation is an integral part of the way the object is represented—that is, what is represented is an object in a particular orientation (e.g., “an upside-down chair”).

Other findings, however, suggest at least some degree of independence between the representation of an object’s shape (and identity) and its orientation. Such a dissociation is demonstrated by patients with “orientation agnosia,” who can recognize objects in a variety of orientations, but are unable to determine their orientations (Cooper & Humphreys, 2000; Fujinaga, Muramatsu, Ogano, & Kato, 2005; Harris, Harris, & Caine, 2001; Turnbull, Beschin, & Della Sala, 1997; Turnbull, Laws, & McCarthy, 1995). Further evidence suggestive of independent coding of object identity and object orientation comes from studies that investigated object recognition under rapid serial visual presentation (RSVP) conditions. When visual stimuli are presented at a rate of 10–12 items/s in the same spatial location, observers frequently experience repetition blindness (RB)—that is, they are much poorer at detecting that an item has been repeated in the sequence than at detecting nonrepeated items. This effect also occurs when the repeated items (objects or letters) are presented in different orientations, and the degree of RB does not vary systematically with the degree of rotation between the repeated versions (Corballis & Armstrong, 2007; Harris & Dux, 2005a, 2005b; Hayward, Zhou, Man, & Harris, 2010). RB has been attributed to a difficulty in registering the two repeated stimuli as separate instances (tokens) of a commonly activated type representation in memory, and it is thought to indicate recognition without conscious awareness—that is, the visual system is sensitive to the fact that the object has already been encountered, but no conscious representation of it is formed (Kanwisher, 1987; Morris, Still, & Caldwell-Harris, 2009). The findings summarized above suggest that activation of the type representation is independent of orientation.

A similar conclusion was reached in a study that examined the effects of orientation at different stages of object processing (Harris, Dux, Benito, & Leek, 2008). In one experiment, observers were required to name a target object presented in its usual upright orientation, which was preceded by a masked prime depicting either the same object or a different object, in a range of picture-plane rotations. The prime duration varied between 16 ms and 350 ms, and the observers were asked to ignore it. Significant priming (i.e., faster naming for target objects preceded by the same-object prime than by a different-object prime) occurred when primes were displayed for 70 ms or longer, and this priming was equivalent across all prime orientations from the earliest time that yielded priming. In a second experiment, observers saw the same rotated primes for an extended period of time and named them. The naming times themselves increased systematically with the degree of rotation from the canonical upright, consistent with the large literature on viewpoint-dependent naming effects (e.g., Hamm & McMullen, 1998; Jolicoeur, 1985; McMullen & Farah, 1991; McMullen & Jolicoeur, 1990; Tarr & Pinker, 1989). These findings were taken as evidence that the initial activation of object representations in memory is orientation invariant—possibly mediated by local shape features or parts—and that the object’s specific orientation is integrated at a later stage of representation, when objects are consolidated in visual short-term memory (see also Dux & Harris, 2007; Harris, Benito, & Dux, 2010). Several authors have suggested that determining the orientation of an object requires a comparison between the current percept and a stored representation of the object that specifies its usual orientation (Corballis, 1988; Harris et al., 2001; McCloskey, 2009). For example, this comparison can take the form of a spatial vector that specifies the distance between the principal axis of the current instance of the object and the axis of the representation of the object stored in memory (McCloskey, 2009; McCloskey, Valtonen, & Cohen Sherman, 2006). The derived vector may be viewed as another object attribute, like color, that varies across viewing instances. Thus, according to this proposal, the orientation of the object at a particular time would be bound to its identity in visual short-term memory in order to deliver the percept of the viewed object.

One limitation of these previous studies is that the coding of orientation was only tested indirectly, by examining accuracy and speed of identification across different orientations of the objects. This makes it difficult to know whether information about an object’s orientation is truly dissociated from the object’s identity or whether it is simply ignored because it is irrelevant to the task. Therefore, in the present study, we sought direct evidence about the representation of object orientation and its relationship to the object’s identity. Are these bound together in a unified representation during perception and encoding in working memory, or are they represented independently in parallel? And if the latter, does one feature receive priority processing?

There is a rich literature on the “binding problem” (the question of how separate features of an object are integrated into a single object representation) that has preoccupied researchers for decades. Perhaps the most influential cognitive theory of binding is the feature integration theory (FIT) proposed by Treisman and Gelade (1980). According to FIT, different features of an object, such as its shape and color, are initially registered in parallel in specialized feature maps, with features that co-occur at a spatial location being bound together through a serial deployment of spatial attention (Treisman, 2006; Treisman & Gelade, 1980). This notion was later expanded into the concept of object files, which are essentially bundles of object features indexed by a spatiotemporal tag and which serve as the middle ground between raw features and long-term memory representations (Kahneman & Treisman, 1984; Kahneman, Treisman, & Gibbs, 1992). Many studies have since confirmed an important role for spatial location in feature binding (e.g., Friedman-Hill, Robertson, & Treisman, 1995; Kovacs & Harris, 2019; Pertzov & Husain, 2014; Robertson, Treisman, Friedman-Hill, & Grabowecky, 1997; Schneegans & Bays, 2017; Treisman & Zhang, 2006). For example, some studies have shown that features are only linked via their shared location rather than being directly bound to each other (Kovacs & Harris, 2019; Schneegans & Bays, 2017); that observers automatically have access to the location of a probed feature, but not to a feature present at a probed location (Chen & Wyble, 2015); and that impairments in spatial abilities can result in binding errors between features of different objects (Robertson et al., 1997). Further evidence consistent with this notion also comes from a recent study that looked at the binding between the orientation and color of simple line stimuli (Pertzov & Husain, 2014). In this study, stimuli were presented sequentially in different colors and orientations, and observers had to report the orientation of the bar of a precued color. Pertzov and Husain (2014) found that observers often incorrectly reported the orientation of a different bar presented in the sequence when the bars shared the same location, but not when they shared another feature, such as color, and were presented in different locations. Thus, spatial location seemed to be important for maintaining the correct binding between features, likely because it facilitates the formation of separate object files.

Another question that has generated a fair amount of interest is whether features that are bound through common location during perception remain bound together in memory. Early findings suggested that objects are maintained as integrated wholes in visual short-term memory, with the capacity of this memory store being limited by the number of objects rather than the number of features making up these objects (Luck & Vogel, 1997). However, other studies claimed that the bindings between features are not maintained in memory without sustained attention to the remembered material (Horowitz & Wolfe, 1998; Wolfe, 1999). Wheeler and Treisman (2002) argued for a middle-ground position, whereby features coded along different dimensions are stored independently, but the binding between these features is also maintained if it is task relevant. Specifically, Wheeler and Treisman found that maintenance of the bindings was particularly taxed when the test displays contained multiple stimuli presented in different locations, which they argued drew on spatial memory resources that were otherwise needed to maintain the feature bindings. In contrast, the bindings were maintained successfully if the test display consisted of only one stimulus, or if the task required free recall of the feature present at a particular location, meaning that a spatial comparison between the initial and test displays was not necessary.

Most studies that investigate feature binding use arbitrary combinations of primitive features, such as color, location, basic shape, or the orientation of lines. It stands to reason that attention may be necessary to maintain such arbitrary bindings. But what about meaningful, familiar visual objects? How are they perceived and remembered when they are presented in different orientations, which can immediately be judged to be “wrong” (i.e., not the canonical orientation of the object)? As outlined above, there is considerable evidence that object recognition is viewpoint dependent in a manner that would be consistent with the notion that the object’s orientation is fully integrated with its shape (e.g., Hamm & McMullen, 1998; Jolicoeur, 1985; McMullen & Farah, 1991; McMullen & Jolicoeur, 1990; Tarr & Pinker, 1989). On the other hand, the RSVP studies summarized earlier suggest that when attention is taxed, the identity of an object can be represented in an orientation-invariant manner (Dux & Harris, 2007; Harris et al., 2010; Harris & Dux, 2005a, 2005b; Hayward et al., 2010), perhaps because attentional resources are necessary to maintain the binding between the object’s identity and its orientation.

To test this, in the present study we solicited explicit judgements of object orientation, as well as evidence of object identification, for objects that were briefly presented. Specifically, we asked participants to judge whether the object was rotated away from its usual upright (canonical) orientation by 90° (to the left or to the right) or by 180°. We reasoned that if objects are represented perceptually as bound units of shape and orientation (i.e., a view-based representation), then identifying an object would automatically provide access to information about its current orientation. Conversely, if the identity and the orientation of an object are represented independently, then we should see a dissociation between knowing what an object is and knowing its orientation. We might even see incorrect bindings or “illusory conjunctions” (Treisman & Schmidt, 1982) between the identities and orientations of objects presented in close temporal proximity, whose representations would be active at the same time, similar to the findings reported by Pertzov and Husain (2014).

We ran four experiments in which we presented two different rotated objects very briefly and curtailed their processing by masking them with a forward and a backward mask. Participants were cued with an upright object and had to report whether that object was present on that trial or not (yes/no response) and to indicate its orientation relative to the canonical upright, guessing if necessary (e.g., if they had not seen the object). All objects had a usual canonical upright orientation, but were presented in one of three alternative orientations (rotated 90° clockwise, 90° counterclockwise, or 180° from the correct upright orientation); the objects were never shown upright, so as not to be visually identical to the cue, which would allow identification on the basis of low-level image cues. To test whether location influenced the binding of object shape and orientation, in Experiments 1 and 2 the two objects were presented sequentially at fixation for 70 ms each, whereas in Experiment 3 the two objects were presented simultaneously for 70 ms to the left and right of central fixation. In Experiment 4, the two objects were also presented simultaneously in different locations, as in Experiment 3, but for double the amount of time (140 ms), to ensure that the participants had the same total amount of time to view the object as they had in Experiments 1 and 2.

To preview the results, in all four experiments we found that participants had incomplete information about the orientation of the objects they had identified correctly—only reporting the correct orientation on about 70% of the trials—and they were at chance in reporting the orientation when they had not recognized the object. Both of these results are consistent with the idea that object identity and orientation are not bound together in a fully integrated view-based representation. The object’s orientation appears to be determined after object identification and to require knowledge of the object’s identity. Furthermore, in Experiments 1 and 2—where the two objects were presented in the same spatial location—when participants made an orientation error they were more likely to report the orientation of the other object presented on the same trial than an unseen orientation, reflecting misbindings of object identities and orientations. This susceptibility to binding errors was not present when the two objects were in different spatial locations, irrespective of the exposure duration of the objects.

Experiment 1

Method

Participants

Twenty-four first-year undergraduate psychology students participated for partial course credit. Participants provided informed consent, and the experimental procedures were approved by the Human Research Ethics Committee of the University of Sydney.

Apparatus and materials

The experiment was programmed in Presentation (Neurobehavioral Systems; www.neurobs.com) and was presented on a 19-in. Dell Trinitron CRT monitor refreshing at 85 Hz. The stimuli consisted of 60 line drawings of objects with a well-established canonical orientation, taken from the Snodgrass and Vanderwart (1980) corpus, and included objects from various categories (see Appendix). These objects’ canonical upright orientation was unambiguous in every case, so determining whether the object was upright, upside down, or rotated by 90° would be a trivial task if participants had sufficient time to view the stimuli. Indeed, using similar stimuli, Harris and Dux (2005a, Experiment 3) showed that with exposure durations as short as 100 ms, judging a single object’s orientation in this manner is achievable with 80%–95% accuracy. The objects subtended approximately 7° of visual angle and were viewed from a distance of approximately 45 cm from the monitor, although viewing distance was not fixed. Pattern masks were generated by creating random shapes drawn in the same line thickness as the objects (see Fig. 1). All stimuli were black against a white background.
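For readers checking the timing, note that on a CRT all stimulus durations must be integer multiples of the refresh period. The following minimal sketch (in Python; our illustration, not the original Presentation script) shows that the reported durations are consistent with whole frames at 85 Hz:

REFRESH_HZ = 85
FRAME_MS = 1000 / REFRESH_HZ  # one frame lasts ~11.76 ms at 85 Hz

def frames_for(duration_ms):
    # Nearest whole number of frames for a nominal duration
    return round(duration_ms / FRAME_MS)

for label, ms in [("object", 70), ("mask", 106)]:
    n = frames_for(ms)
    # object: 70 ms -> 6 frames (70.6 ms actual); mask: 106 ms -> 9 frames (105.9 ms actual)
    print(f"{label}: {ms} ms -> {n} frames ({n * FRAME_MS:.1f} ms actual)")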

Fig. 1

Examples of the trial structure in Experiment 1, for the Before task (left panel) and the After task (right panel). In the Before task, a picture cue was presented for 1 s in the center of the screen, followed by a short RSVP sequence consisting of two objects presented in different orientations, which were preceded and followed by masks. In the After task, the cue occurred after the RSVP stream. All stimuli were presented sequentially at fixation, with the objects shown for 70 ms each and the masks for 106 ms each. Participants decided whether the cued object was present in the stream and then reported its orientation

Experimental design and procedure

All trials contained two objects that were presented in one of three orientations relative to the upright: 90° clockwise, 90° counterclockwise, or 180°, with the two orientations on a trial always being different from each other. Thus, there were six orientation combinations, generated by crossing the three unordered pairs of orientations with the two possible orders (e.g., 90° clockwise then 180° vs. 180° then 90° clockwise). Each pair of orientations occurred an equal number of times across the experiment. Objects were randomly assigned to these orientations on each trial by the computer.
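To make the counting of conditions concrete, the design can be expressed in a few lines of Python (an illustrative sketch only; the experiment itself was programmed in Presentation, and the labels below are ours):

from itertools import permutations

ORIENTATIONS = ["90cw", "90ccw", "180"]  # the three possible rotations

# Two different orientations per trial, in either order: 3 x 2 = 6 ordered pairs
pairs = list(permutations(ORIENTATIONS, 2))
print(pairs)       # [('90cw', '90ccw'), ('90cw', '180'), ('90ccw', '90cw'), ...]
print(len(pairs))  # 6 orientation combinations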

The participants were divided into two groups (N = 12 each) who completed different versions of the experiment. In one version, the cue object was presented before the RSVP stream (Before task); in the other, it was presented after the RSVP stream (After task). For the Before task, each trial began with the presentation of the cue object, shown in its usual upright orientation, for 1 s, along with the questions “Present/Absent?” and “Orientation?” printed underneath (see Fig. 1, left panel). This was followed by a rapid sequence of stimuli presented in the center of the screen, consisting of a forward mask presented for 106 ms, followed by the two objects, each presented for 70 ms, and then a backward mask for 106 ms. Participants made two unspeeded responses: first, they indicated whether the cue object had been present in the stream (yes/no), using two keys on the keyboard (the + and the Enter key on the number pad) labeled Y and N; they then indicated the orientation of the target item relative to its canonical upright, using the arrow keys on the number pad. They were asked to guess an orientation even if they thought the object had not been presented, and were allowed to nominate the upright orientation as a possible response (this was done in order to see whether participants defaulted to upright when they were guessing an orientation). For the After group, the trial was structured in exactly the same way, except that the rapid sequence of stimuli occurred first, followed by the cue object and questions (see Fig. 1, right panel). Thus, this version of the task relied more heavily on short-term memory and required the participants to keep both objects in mind, as they did not know in advance which one they should look out for. In both versions there were 240 target-present trials and 120 target-absent trials, equally distributed across the six orientation-combination conditions. All conditions were randomly intermixed.

Results

Object-detection accuracy

On average, participants correctly reported the presence of the target object on 81.21% of target-present trials in the Before task and on 69.00% of target-present trials in the After task (see Table 1 for hit and false-alarm rates). Although this difference in hit rates was significant (p = .003), the two groups did not differ in terms of their d′ values (Before group, 1.91; After group, 1.82; p = .55). This indicates that the After group were not less sensitive to the presence of the object but, rather, adopted a more conservative criterion for reporting that the object had been present on that trial.
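We assume the d′ values were computed with the standard yes/no signal-detection formula, d′ = z(hit rate) − z(false-alarm rate); a minimal sketch under that assumption is given below. Note that applying the formula to group-mean rates gives a slightly different value from the reported mean of per-participant d′ values.

from statistics import NormalDist

def d_prime(hit_rate, fa_rate):
    # Standard yes/no sensitivity: d' = z(H) - z(FA)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)

# Group-mean rates for the Before task (hit rate 81.21%; false-alarm rate
# 17.70%, taken from the Discussion) give d' of about 1.81, close to the
# reported per-participant mean of 1.91.
print(round(d_prime(0.8121, 0.1770), 2))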

Table 1 Hit rate, false-alarm rate, and d′ (SD) for target identification in the four experiments

Orientation responses on target-present trials

Figure 2a shows the distribution of orientation responses in the Before and After tasks, plotted as a function of the object-detection responses (hits vs. misses). Accuracy was higher on object-detection hit trials than on miss trials (69.87% vs. 33.37% in the Before task, and 55.20% vs. 28.54% in the After task). To assess whether these accuracy rates reflect a difference in the observers’ sensitivity to the object’s orientation, we calculated d′ for the orientation reports, estimating the rate of orientation “false alarms” from the correct-rejection trials (on which observers knew that the object had not been presented and were merely guessing an orientation) as the probability of reporting either of the two orientations present in the stream, divided by 2. These d′ values are shown in Fig. 2b and were analyzed with a two-way ANOVA, with object detection (hit vs. miss) as the within-subjects factor and task (Before vs. After) as the between-subjects factor. Sensitivity was significantly higher on hit trials (d′ = 0.90) than on miss trials (d′ = 0.05), F(1, 22) = 113.81, p < .001, ηp² = .838, and higher in the Before task (d′ = 0.57) than in the After task (d′ = 0.38), F(1, 22) = 7.99, p = .01, ηp² = .266; these factors did not interact, F(1, 22) = 2.59, p = .122, ηp² = .105. d′ was significantly above zero (i.e., indicating some sensitivity to orientation) on hit trials in both the Before and After tasks (ts > 8.73, ps < .001), but was no different from zero on miss trials in either task (ts < 1.41, ps > .176). Taken together, these results indicate that the ability to judge an object’s orientation is heavily dependent on having identified the object.
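A sketch of this orientation-sensitivity computation, as we have just described it (our reconstruction, not the authors' analysis code; the guessing rate used below is hypothetical, since the correct-rejection response distribution for Experiment 1 is not reported):

from statistics import NormalDist

def orientation_d_prime(p_correct_on_hits, p_presented_reported_on_cr):
    # The orientation "false-alarm" rate is the probability of reporting
    # either of the two presented orientations on correct-rejection trials
    # (i.e., when merely guessing), divided by 2.
    z = NormalDist().inv_cdf
    fa_rate = p_presented_reported_on_cr / 2
    return z(p_correct_on_hits) - z(fa_rate)

# Hypothetical illustration: 69.87% correct orientation reports on hit trials
# and a 66% rate of guessing one of the two presented orientations on
# correct rejections yield d' of about 0.96.
print(round(orientation_d_prime(0.6987, 0.66), 2))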

Fig. 2

a Percentage of different types of orientation responses in Experiment 1, plotted as a function of task (cue object presented before or after the RSVP stream) and object-detection response (hits vs. misses). Correct (black) = the orientation was reported correctly; distractor (gray) = the orientation of the other object was reported; absent (white) = an orientation that was not presented on the trial was reported; upright (striped) = the object was reported as being upright. * indicates a significant difference between reporting the orientation of the distractor object versus an absent orientation, p < .001. b Sensitivity to the correct orientation (d′) on object-detection hit and miss trials in the Before and After tasks

Next, we looked at the types of errors made by the participants, which are also displayed in Fig. 2a, plotted as a function of the object-detection response and task. These errors include reporting the orientation of the other object present on that trial (distractor) or reporting an orientation that was not present on that trial (absent or upright; NB: we distinguish between these because upright was not an orientation that was ever presented during the experiment, whereas the absent orientation, while not presented on that particular trial, was nevertheless a possible orientation in the experiment).
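In code form, this classification reduces to a comparison of each orientation response against the two orientations shown on that trial (an illustrative sketch; the function and variable names are ours, not part of the original analysis):

def classify_response(response, target_ori, distractor_ori):
    # Score one orientation report against the trial's presented orientations
    if response == target_ori:
        return "correct"
    if response == distractor_ori:
        return "distractor"  # candidate misbinding of identity and orientation
    if response == "upright":
        return "upright"     # never presented anywhere in the experiment
    return "absent"          # possible in the experiment, absent on this trial

print(classify_response("90ccw", target_ori="90cw", distractor_ori="180"))  # absent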

For object-detection hit trials, a 3 × 2 mixed ANOVA, with orientation response as the within-subjects factor and task as the between-subjects factor, yielded significant effects of orientation response, F(2, 44) = 10.97, p < .001, ηp² = .333, and task, F(1, 22) = 10.47, p < .001, ηp² = .323, as well as a significant interaction, F(2, 44) = 4.47, p = .02, ηp² = .169. We next analyzed the distribution of these responses separately for each task (Before, After), using two orthogonal planned contrasts: one that compared upright responses to the other two “experiment-present” orientations (i.e., distractor and absent), and one that compared the distractor (present on that trial) and the absent (not present on that trial) orientation responses. Observers gave significantly fewer upright responses compared with the other two responses in the Before task, F(1, 11) = 24.32, p < .001, ηp² = .689, indicating some sensitivity to the fact that objects were never presented upright. However, this was not the case in the After task, where observers made greater use of the upright response, such that they were not less likely to give that response compared with the other two, F(1, 11) = .01, p > .9. Intriguingly, observers were significantly more likely to report the orientation of the distractor rather than the absent orientation, both in the Before task, F(1, 11) = 40.34, p < .001, ηp² = .786, and in the After task, F(1, 11) = 31.70, p < .001, ηp² = .742. That is, they tended to misbind the orientations and the object identities presented in rapid succession.

For object-detection miss trials, the same 3 × 2 mixed ANOVA, with orientation response and task as factors, yielded a significant main effect of orientation response, F(2, 44) = 6.76, p = .003, ηp² = .235, but no effect of task, F(1, 22) = 2.41, p = .131, ηp² = .101, and no interaction between response and task (F < 1). The planned orthogonal contrasts comparing the proportions of orientation responses revealed that, similar to the hit trials, observers were significantly less likely to give an upright response compared with the other types of responses in the Before task, F(1, 11) = 7.38, p = .02, ηp² = .402, but that in the After task the proportion of upright responses was not different from the other two responses, F(1, 11) = 2.03, p = .182, ηp² = .156. Contrary to the pattern of responses on the hit trials, on these miss trials there was no difference between the proportions of distractor and absent orientation responses, regardless of task (After task: F(1, 11) = 2.05, p = .18, ηp² = .157; Before task: F < 1). These results suggest that when observers have failed to identify the object, they are likely to be guessing orientations randomly.

Discussion

This experiment yielded several interesting results. The first is that even when participants correctly identified a cued object from a briefly displayed sequence, they reported its orientation correctly less than 70% of the time; in fact, when the cue object was presented after the test stimuli (After task), orientation accuracy was only 55% correct. Thus, it is possible to know what an object is without necessarily knowing its orientation, in line with reports of orientation agnosia following brain damage (Fujinaga et al., 2005; Harris et al., 2001; Turnbull et al., 1997).

The second result of this experiment is that participants nevertheless demonstrated some sensitivity to the orientation of the identified objects, as their d′ measures were significantly above zero. In contrast, when they failed to identify the cued object (i.e., on the missed trials), they appeared to be guessing the orientation, as d′ was no different from zero. This suggests that knowing the orientation is dependent on having first identified the object (though clearly not the other way around).

The third result is that when they made an orientation error on object-detection hit trials, observers were most likely to respond with the orientation of the alternative object presented on that particular trial, compared with an absent orientation. This pattern of binding errors was present both in the Before task, where every single subject demonstrated this bias, and in the After task, where it was shown by 8 out of 12 subjects. Participants in the Before task were also less inclined to give an upright orientation response, an orientation that was never presented during the experiment, compared with one of the orientations that were actually possible. Together, these findings suggest that observers had some knowledge of the orientations of the objects present on any given trial, but they made frequent binding errors between the objects’ identities and orientations.

One potential concern with the asymmetry between the object-detection accuracy and the orientation-judgement accuracy is that it might be an artifact of the task structure, rather than indicating that observers did not have orientation information about the objects they had identified correctly. Specifically, the object-identification task was essentially a two-alternative forced choice (target present vs. target absent), while the orientation-judgement task was a four-alternative forced choice. Therefore, observers could get the object identity correct 50% of the time simply by guessing. On these “lucky guess” (or false-hit) trials, they would not have any real information about the orientation, but they could still guess the correct orientation 25% of the time (since there were four possible orientation responses). Hence, could the lower accuracy for orientation simply reflect the lower probability of guessing the orientation correctly, compared with guessing the identity correctly, on trials in which the object was not in fact detected? To check this, we first calculated the true-hit rate for object detection, starting from the relation Observed-Hit Rate = True-Hit Rate + False-Hit Rate, where the false-hit rate can be estimated by applying the false-alarm rate to the target-present trials on which the subject has not made a true hit (i.e., to 1 − True-Hit Rate, since the observer cannot simultaneously make a true hit and a false hit on the same trial). Rearranging gives the formula True-Hit Rate = (Observed-Hit Rate − False-Alarm Rate)/(1 − False-Alarm Rate). For the Before task, using this formula with the observed-hit rate of 81.21% and a false-alarm rate of 17.70% yields a true-hit rate of 77.2%, meaning that hits on about 4% of target-present trials could just be lucky guesses. If we assume that observers have access to a bound representation of identity and orientation, and thus know the orientation in every case in which they have identified the object, and that they correctly guess the orientation on one fourth of the false-hit trials, this should produce a maximum correct orientation rate equal to the rate of True Hits plus one quarter of the rate of False Hits, as a proportion of observed hits. These calculations yield a maximum correct orientation rate of (77.2 + 4.0/4)/81.21 = 96.3%. The observed orientation accuracy rate on hit trials of 69.9% is clearly far below this value. Similar calculations for the After task yield an estimated true-hit rate of 63.57%, meaning that hits on about 5.4% of target-present trials could have been lucky guesses. In this case, if the orientation was known for all true hits, the estimated orientation accuracy for hits should be 94.1%, which again is well above the observed value of 55.2%. These calculations demonstrate that our results cannot be accounted for by guessing.
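For transparency, the correction can be expressed as a small function (our own code, included only to make the arithmetic explicit; it is not part of the original analysis pipeline):

def max_orientation_accuracy(observed_hit, fa, n_orientation_choices):
    # True-hit rate from observed hits and false alarms
    true_hit = (observed_hit - fa) / (1 - fa)
    false_hit = observed_hit - true_hit  # "lucky guess" rate
    # If orientation were known on every true hit and guessed on false hits
    return (true_hit + false_hit / n_orientation_choices) / observed_hit

# Before task of Experiment 1: four orientation response options
print(round(max_orientation_accuracy(0.8121, 0.1770, 4), 3))  # ~0.963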

As might be expected, performance in the After task, where participants did not know in advance which object to look out for, was generally poorer than in the Before task. Although sensitivity for detecting the target object was not in itself any different, the information that participants held about the objects’ orientations was clearly more fragile in the After task. For one, the overall accuracy of orientation reports for the identified objects was significantly lower (55.2% compared with 69.9% in the Before task). Additionally, participants were somewhat less likely to favor the distractor orientation over an absent orientation, and they were more likely to give an upright response (on 15.2% of the object-detection hit trials, compared with 4.9% in the Before task). Thus, it appears that orientation information gleaned from these briefly presented objects is quite fleeting, and this caused participants to be more likely to default to the upright orientation, as this was the last orientation they had seen when presented with the cue at the end of the trial—and, of course, it also matches the canonical orientation stored in memory.

This experiment produced some interesting results, but the conclusions that we can draw from them are somewhat complicated by the fact that the response choices and the number of actual orientations presented were not equated. That is, there were three possible orientations (90° clockwise, 90° counterclockwise, and 180°) and four response choices (including 0°, upright). This makes it tricky to calculate guessing rates, because it is not clear what constitutes chance performance, and participants’ use of the upright response itself seemed to be influenced by the nature of the task. For this reason, in Experiment 2 we sought to replicate these findings in a more tightly controlled fashion, using only the three response alternatives corresponding to the actually presented orientations and administering only the Before task, in order to minimize reliance on memory.

Experiment 2

Method

Participants

Thirteen new participants from the same pool took part in this experiment. We initially aimed for 12, to provide a replication of the Before task in Experiment 1, but data from one additional participant were collected due to a scheduling error, and we decided to retain all data sets.

Procedure

This experiment consisted of the Before task employed in Experiment 1, with the only difference being that participants were explicitly told that the objects were only ever presented in three possible orientations, and they should use only these options as their orientation response.

Results and discussion

Object-detection accuracy

On average, participants correctly reported the presence of the target object on 80.84% of target-present trials (see Table 1). They had a false-alarm rate of 22.32% (d′ = 1.86). Thus, detection accuracy was very similar to that in Experiment 1.

Orientation responses on target-present trials

Mean accuracy for judging the orientation of the target object when it was present was significantly higher when the object had been successfully detected (hit-trial accuracy = 67.6%) than when the object was missed (36.5%), t(12) = 6.09, p < .001 (see Fig. 3, left panel). The proportions of correct orientation responses replicate very closely the results of Experiment 1. The current experiment’s structure enabled us to determine unambiguously that the accuracy for reporting the orientation of a missed object was no different from chance (33.33%, since there were three possible orientations and three corresponding response options in the experiment; p = .155).

Fig. 3

Percentage of orientation responses (correct orientation [black] vs. distractor’s orientation [gray] vs. an absent orientation [white]) plotted separately for the hit, miss, correct-rejection (CR), and false-alarm (FA) object-detection responses. The left panel shows results of Experiment 2, the middle panel shows results of Experiment 3, and the right panel shows results of Experiment 4. * and # indicate significant differences (*p < .001; #p < .05)

Participants gave significantly more correct orientation responses when objects were rotated by 180° (38.12% of all correct responses) than when they were rotated by 90° clockwise (30.13%) or 90° counterclockwise (31.74%), ps < .02, whereas the accuracy rates for the two 90° orientations did not differ from each other. One possible reason for this pattern of results is that the two 90° orientations are more difficult to discriminate from each other; for instance, the observer might have partial information about orientation and know that the axis of the object is rotated by 90°, but not know the polarity of that axis—that is, whether the top of the object points left or right. To test whether this could explain the results, we looked at whether observers were more likely to respond with “the other 90°” orientation when the target was rotated by ±90° than with “180°.” In other words, if the target object was rotated by 90° clockwise, were observers more likely to respond 90° counterclockwise (and vice versa) than 180°? Observers in fact made equivalent proportions of “180°” and “the other 90°” responses (21.83% and 21.3% of the errors, respectively; p = .87), so the advantage for upside-down objects does not seem to be due to a confusion between the two 90° orientations. Taken together, these results suggest that the ability to judge an object’s orientation is heavily dependent on the object having been identified, and that it is genuinely easier to judge the orientation of inverted objects. We will return to this issue in the General Discussion.

Incorrect orientation responses were classified according to the type of error made and are displayed in Fig. 3 (left panel). Given that the upright response was disallowed, in this experiment there were only two kinds of incorrect responses: reporting the orientation of the other object present on that trial (distractor) or reporting an orientation that was not present on that trial (absent). When the object was correctly identified (hit trials) but the orientation was not, participants were significantly more likely to report the orientation of the distractor than an absent orientation (18.9% vs. 13.6%), t(12) = 6.02, p < .001, indicating a bias to report an orientation present on that trial, rather than a random orientation. This bias was present in 12 of the 13 participants, while the remaining one gave exactly the same proportion of distractor and absent orientation responses. In contrast, when the object was missed, there was no bias to report the distractor orientation compared with the absent orientation (33.6% vs. 30.1%, p = .418), and neither of these values differed significantly from the correct orientation responses (36.5%; ps > .12). This replicates the results of Experiment 1 and indicates (1) that participants were genuinely guessing orientations when they failed to identify the object and (2) that participants did not have an inherent bias to report the distractor orientation just by virtue of its presence in the stimulus sequence. Rather, these results suggest that the orientation errors made on target-present trials represent genuine misbindings of the target’s identity with the orientation of the other item present in the sequence.

Orientation responses on target-absent trials

To verify whether participants had a bias to respond with particular orientations, we also looked at orientation responses on target-absent trials. Here, there is no correct orientation response because there is no target. However, one could still observe whether subjects show a bias to report one of the orientations that were present on the trial over an absent orientation. Note that on these trials there are twice as many opportunities to report a distractor orientation—given that there are two “distractors” presented during the trial—as an absent orientation. Therefore, the rate of reporting a distractor orientation was halved, for ease of exposition and comparison with the absent orientation reports (see Fig. 3, left panel).

On trials in which participants correctly reported that the target was absent (i.e., correct rejections), the orientation responses were evenly split between either of the orientations that were present on that trial (33.75%) and an absent orientation (32.46%), indicating that participants had no biases in guessing orientations in the absence of a target object (p = .423). On target-identification false-alarm trials, there was an increased tendency to report a distractor orientation (35.72%) more often than an absent orientation (28.55%), but this difference failed to reach significance (p = .149). Thus, this analysis supports our conclusion that participants did not have an overall bias to report orientations that had been presented on the trial in the absence of a correct target identification.

In order to verify whether the asymmetry in accuracy rates for the object-detection and orientation-judgements tasks could be accounted for by different guessing rates in the two tasks, we performed the same analysis as in Experiment 1. To estimate the true-hit rate for object detection, we used the formula True Hit Rate = (Observed Hit Rate – False Alarm Rate)/(1 – False Alarm Rate). With a hit rate of 80.84% and a false-alarm rate of 22.3%, this yields a true-hit rate of 74.82%, meaning that 6% of the hit trials could just be lucky guesses. If we assume that observers have access to a bound representation of identity and orientation, and thus know the orientation in every case when they have identified the object, and then guess the orientation on the false-hit (lucky guess) trials, this should produce a maximum correct orientation rate equal to the rate of True Hits + the rate of False Hits/3 (since in this experiment there were only three orientation response choices), as a proportion of observed hits. These calculations yield a maximum correct orientation rate of (74.82 + 6/3)/80.84 = 95.02%. The observed orientation accuracy rate on hit trials of 67.6% is clearly far below this value, which demonstrates that the results cannot be explained by guessing. A further argument against a guessing explanation is that the proportion of correct orientation responses was practically the same as in the Before task of Experiment 1, despite the fact that there the orientation task was a four-alternative forced choice and here it was a three-alternative choice.

Experiment 3

An interesting finding that emerged from the first two experiments is that orientation errors for correctly identified objects are not random. Instead, in both experiments participants produced a significant number of binding errors, reporting the orientation of the alternative object present in that trial. This propensity for misbinding occurred both when observers had to rely on their memory of both items (the After task of Experiment 1) and when they had prior knowledge of what object they needed to look out for and, therefore, could ignore the other object (the Before tasks of Experiments 1 and 2). In the first two experiments, the two objects were presented sequentially in the same spatial location. The aim of Experiment 3 was to test whether the tendency to erroneously bind objects and orientations also occurs when the two objects are presented in different spatial locations.

Method

Participants

Twelve new undergraduate students from the same pool participated in this experiment. One subject was excluded and replaced, due to an exceedingly high false-alarm rate (113/120 target-absent trials).

Procedure

Experiment 3 had an identical design to Experiment 1 (Before task) and Experiment 2, with the exception that the two objects presented on each trial were shown simultaneously, rather than sequentially, to the left and right of fixation, approximately 2° from fixation (see Fig. 4). The object that had appeared first in Experiment 1 was presented on the left, while the object that had appeared second was presented on the right. The objects were shown simultaneously for 70 ms, preceded and followed by 106-ms-long masks.

Fig. 4

Example of the trial structure in Experiments 3 and 4. A picture cue was presented for 1 s in the center of the screen. This was followed by two objects presented in different orientations at the left and right of fixation, which were preceded and followed by masks. The objects were presented for 70 ms in Experiment 3 and for 140 ms in Experiment 4. The masks were presented for 106 ms

Results and discussion

Object-detection accuracy

On average, participants correctly detected the target object on 65.04% of target-present trials, with a false-alarm rate of 18.4% (d′ = 1.52; see Table 1). Not surprisingly, this object-detection task, in which two objects are presented simultaneously away from fixation, is more difficult than the sequential, central-vision task used in the first two experiments.

Orientation responses on target-present trials

As in Experiments 1 and 2, participants were significantly more accurate in judging the orientation of the target object on correctly detected (hit) trials (74.25%) than on missed trials (36.39%), t(11) = 12.75, p < .001 (see Fig. 3, middle panel), although here the mean orientation accuracy on hit trials was somewhat higher than in Experiments 1 and 2. However, similar to Experiments 1 and 2, performance on missed trials was barely above chance (p = .050). Also similar to Experiment 2, participants were more likely to get the orientation right when the object was rotated by 180° (37.6% of all correct orientation responses) than when it was rotated by 90° clockwise (30.81%; p = .005) or 90° counterclockwise (31.6%), although the latter difference did not reach statistical significance in this experiment (p = .075). We tested whether observers were more likely to produce a 90° confusion response than a 180° response on trials with targets rotated by 90°, and found that in this experiment this was indeed the case (180° responses = 10.27% of the errors vs. “other 90°” responses = 19.39% of the errors; p < .001). Together with the generally higher mean orientation accuracy, this may indicate that observers had a better sense of the objects’ orientations in this experiment, at least at the coarse level of 90° versus 180° rotations.

An analysis of the orientation errors on target-identification hit trials revealed a different pattern from that in Experiments 1 and 2. Here, participants were just as likely to report the orientation of the distractor item as an absent orientation (13.14% vs. 12.58%, p = .65). When the object was missed, there was also no bias to report the distractor orientation compared with the absent orientation (32.08% vs. 31.66%, p = .937), and neither of these values differed significantly from the correct orientation responses (36.39%; ps > .07). In other words, when the two objects appeared in different spatial locations, participants were unlikely to misattribute the orientation of the distractor to the target object. These results are displayed in Fig. 3, middle panel.

Orientation responses on target-absent trials

Orientation guesses on target-absent trials were investigated in the same manner as in Experiment 2. On trials in which participants correctly reported that the target was absent (i.e., correct rejections), the orientation responses were evenly split between either of the orientations present on that trial (33.28%) and an absent orientation (33.44%), indicating that participants were not biased to guess any particular orientation in the absence of a target object (p = .922). However, on false-alarm trials, there was an increased tendency to report one of the distractor orientations (38.44%) compared with an absent orientation (23.11%), t(11) = 3.17, p = .009. This suggests that when participants made false alarms, they were perhaps misidentifying one of the presented objects as the target and reporting its (correct) orientation.

Experiment 4

It is possible that the difference in the rates of binding errors between the first two experiments and Experiment 3 is due to the more limited processing time available in Experiment 3. There, the two objects were shown simultaneously, for a total display time of 70 ms, whereas in the first two experiments observers had 70 ms of processing time per item. To check this, we repeated Experiment 3 but doubled the exposure time, in order to equate the time per item with that in Experiments 1 and 2.

Method

Participants and procedure

Twelve new undergraduate students from the same pool participated in this experiment. Experiment 4 was identical to Experiment 3 in every respect, except that the two objects were displayed for 140 ms.

Results

Object-detection accuracy

In Experiment 4, participants correctly detected the target object on 91.67% of target-present trials and had a false-alarm rate of 18.95% (d′ = 2.46; see Table 1). Thus, not surprisingly, the longer stimulus duration made it significantly easier to detect the target.

Orientation responses on target-present trials

The higher target-detection rates notwithstanding, the pattern of orientation judgements was very similar to that in the other experiments. Participants were significantly more accurate in judging the orientation of the target object on correctly detected (hit) trials (73.15%) than on missed trials (39.48%), t(11) = 4.61, p < .001 (see Fig. 3, right panel). Despite the doubling of the exposure duration, the proportion of correct orientation responses was essentially identical to that in Experiment 3 (if anything, slightly lower). There were also no more correct orientation responses on missed trials than would be expected by chance (p = .29), replicating all previous experiments. As in the other experiments, participants were more likely to get the orientation correct when the object was rotated by 180° (37.24% of all correct orientation responses) than when it was rotated by 90° clockwise (31.91%) or 90° counterclockwise (30.84%), ps < .05. However, in line with Experiment 2 (and unlike Experiment 3), this was not because observers were giving more “other 90°” than “180°” responses for targets that were rotated by 90° (“other 90°” = 19.80% of responses, “180°” = 15.72% of responses, p = .603). Thus, again it seems that the accuracy advantage for objects rotated by 180° is not due to confusions between the two 90° orientations.

The pattern of orientation errors replicated that in Experiment 3 (see Fig. 3, right panel). On target-detection hit trials, participants were just as likely to report the orientation of the distractor item as an absent orientation (13.19% vs. 13.66%, p = .55). When the object was missed, there was also no bias to report the distractor orientation compared with the absent orientation (27.68% vs. 32.56%, p = .385), and neither of these values differed significantly from the correct orientation responses (39.48%; ps > .28). Thus, as in Experiment 3, when the two objects appeared in different spatial locations, participants did not misattribute the orientation of the distractor to the target object.

Orientation responses on target-absent trials

Orientation guesses on target-absent trials were examined in the same manner as in the earlier experiments. On trials in which participants correctly reported that the target was absent (i.e., correct rejections), the orientation responses were evenly split between an orientation that was present on that trial (33.99%) and an absent orientation (32.02%), indicating that participants were not biased to guess any particular orientations in the absence of a target object (p = .498). However, on false-alarm trials, there was an increased tendency to report one of the distractor orientations (37.44%) more often than an absent orientation (25.13%), t(11) = 2.96, p = .013. Again, this suggests that when participants made false alarms, they were perhaps misidentifying one of the presented objects as the target and reporting its (correct) orientation (see Fig. 3, right panel).

General discussion

The aim of this study was to test whether the identity and orientation of an object are bound together in a unified percept, or whether they are represented independently during perception and encoding in short-term memory. In general, the results are more consistent with independent coding of identity and orientation, although they also suggest that knowledge of the object’s orientation is contingent on having identified the object.

A consistent finding in all four experiments was that participants demonstrated reasonably high detection rates of the target object, but they were only able to report the orientation of these correctly identified objects approximately 70% of the time. Our analyses show that this asymmetry in object detection versus orientation judgements was not likely to be due to different response demands of the two tasks inducing different rates of guessing. The findings echo the dissociation encountered in patients with orientation agnosia as a result of brain damage, who can recognize and name objects but cannot interpret their orientations (Cooper & Humphreys, 2000; Fujinaga et al., 2005; Harris et al., 2001; Turnbull et al., 1997). Here, we show that a similar pattern can be demonstrated in healthy participants under time constraints. The present findings are also consistent with those of earlier studies by De Caro and Reeves (De Caro, 1998; De Caro & Reeves, 2000), who found that object identity was determined faster than object orientation, and that the orientation-dependent naming functions seen in many object-recognition experiments are driven by double-checking the object’s orientation, rather than its identity (see also Corballis, 1988, for a similar argument).

In all four experiments, orientation judgements were significantly more accurate for correctly identified objects than for missed objects, with the latter being no better than chance. This supports Corballis’s (1988) proposal that determining an object’s orientation is contingent on having first identified it. He argued on logical grounds that unless one knows what the object is, it would not be possible to determine whether, and how, that object is misoriented relative to its usual canonical orientation. The present results provide empirical support for this intuition. Note that this does not mean that observers are unable to judge the global orientation of a shape—as defined by its axis of elongation—if they do not know the identity of that shape. However, whether that global orientation is to be interpreted as “rotated 90 degrees to the left” or “upside down” only makes sense if one knows how the object is normally oriented (e.g., whether it has a vertical or horizontal axis of elongation, and which part is the top of the object; or, in more challenging cases in which there is no obvious axis of elongation, where the top of the object is). Our results show that when the participants had not identified the object, they were guessing orientations randomly, which is not surprising given that there is no systematic mapping between the objects’ canonical orientation and the principal axis of elongation of their shapes (e.g., objects depicted upright sometimes have a vertical and sometimes a horizontal axis of elongation, and sometimes none). Thus, even though the participants may have been sensitive to the global orientation of the shapes they were seeing, this did not help them to determine how the objects were oriented.

Across all experiments, we also found that observers were more likely to report the correct orientation when the objects were rotated by 180° than when they were rotated by 90°. This finding echoes some previous results that suggest more rapid and reliable orientation judgement of inverted objects, both in patients with orientation agnosia and in healthy participants (Harris & Dux, 2005a; Harris et al., 2001), as well as more successful individuation of objects in RSVP streams when presented upright and inverted (Dux & Harris, 2007; Harris & Dux, 2005b; Hayward et al., 2010). This is thought to be because when an object is inverted, its principal axis corresponds to that of the object’s stored representation in memory, and one only needs to assess the polarity correspondence of these axes (e.g., whether the top of the object is at the expected top location, rather than at the bottom), whereas when an object is rotated by 90°, an additional step of establishing the axis correspondence is required (Harris et al., 2001; McCloskey, 2009). A potential alternative explanation for the advantage seen for 180° is that the two 90° orientations are more confusable. Our analysis of the distribution of orientation responses in Experiment 3 would be consistent with this explanation. However, the results of Experiments 2 and 4 did not conform to this pattern, so, overall, we do not have convincing evidence that this accounts for the 180° advantage.

An important finding of the present study is that the types of errors were different in the sequential versus simultaneous presentation experiments. When the two objects were presented sequentially in the same spatial location, participants were significantly more likely to report the orientation of the distractor present on that trial than an absent orientation—in other words, to misbind the identities and orientations of the objects. This bias was almost universally observed when the task minimized memory demands (it was present in all but one participant out of 25 across the Before task of Experiment 1 and Experiment 2, with the remaining participant giving exactly the same number of both types of responses). It was also present in the majority of participants (8/12) in the After task of Experiment 1, which relied more heavily on short-term memory, although in that case this bias might have been obscured to some extent by an increased tendency to give “upright” responses, which were an allowable option in that experiment. In contrast, when the two objects appeared in different locations simultaneously, we no longer observed a bias to report the orientation of the distractor object (i.e., a binding error) over an absent orientation when observers made orientation errors. It is worth reiterating that when participants missed the target object altogether, they showed no bias to report an orientation present on the trial over an absent one, but rather guessed one of the three possible orientations with equal probability. Thus, the bias seen on target-detection hit trials suggests that participants are sensitive to the orientation of the identified object, but this orientation information is sufficiently loosely bound as to be occasionally attributed to a different object.

There is one potential alternative explanation for the bias to report the distractor’s orientation rather than an absent orientation that we observed in Experiments 1 and 2. On the false-alarm trials, observers were more likely to report the orientation of one of the objects present on the trial (both “distractors” in this case) than an absent orientation. There was a small but nonsignificant trend in this direction in Experiment 2, and the effect was larger and significant in Experiments 3 and 4. The most parsimonious explanation for this is that observers simply misidentified one of the objects as the target and reported its orientation. Could a similar explanation apply to the bias found on hit trials in Experiments 1 and 2? In other words, could this bias represent simple misidentification of the distractor as the target, rather than a misbinding of identity and orientation? We do not think so. If that were the case, we should see this bias in Experiments 3 and 4, where the evidence from the false-alarm trials is stronger, yet we do not. Given this, we argue that the bias to report the distractor object’s orientation in the first two experiments reflects genuine misbinding of the target’s identity with the orientation of the other object present in the sequence.

The propensity for binding errors corroborates the findings of a previous study by Corballis, Armstrong, and Zhu (2007), who used RSVP sequences of letters presented in varying orientations. In that study, participants were probed with a letter either before or after an RSVP stream and had to report the orientation of the cued letter from among three letters presented in the stream. The participants frequently reported the orientation of another letter present in the stream, particularly if they were probed after the RSVP stream. Corballis et al. interpreted this as evidence that both identity and orientation are processed during RSVP, but are stored in independent visual short-term memory stores (see also Wheeler & Treisman, 2002). Here, we see a tendency to misbind the identity and orientation of familiar objects even when the cue appears before the stream and the stream contains only two objects, thus minimizing the memory demands of the task. Thus, our results provide stronger evidence that participants might never form a fully bound representation depicting an object in a specific orientation (i.e., a holistic view-based representation).

The results of Experiments 3 and 4 suggest that presenting objects in different spatial locations offers some protection against featural interference from distractor objects. This finding corroborates the results of Pertzov and Husain (2014) mentioned in the Introduction. In Pertzov and Husain’s study, simple bars were presented sequentially in different colors and orientations, and participants tended to incorrectly report the orientation of a different bar presented in the sequence when the bars shared the same location, but not when they were in different locations. Thus, our present findings, which are based on a higher-level conceptualization of orientation, together with those of Pertzov and Husain, are consistent with the idea that spatial location can protect the integrity of feature conjunctions and act to individuate object representations (Treisman & Gelade, 1980; Treisman & Zhang, 2006; Wheeler & Treisman, 2002). However, it should be acknowledged that other factors differed between the first two and last two experiments, in addition to whether or not the objects shared the same spatial location. For one, in Experiments 1 and 2 the objects were presented centrally in the fovea, whereas in Experiments 3 and 4 they were presented peripherally. While we do not think that this is the reason for the difference, given that the stimulus location could more accurately be described as parafoveal (approx. 2° from fixation), this possibility could be investigated in future studies by presenting the stimuli sequentially in the same peripheral spatial location. A second possibility is that the difference may be due to the sequential versus simultaneous nature of the presentation, rather than the shared (or not) spatial location. In support of this idea, a previous study of working memory fidelity for object features found that working memory is particularly vulnerable to feature misbindings when the objects are presented sequentially (Gorgoraptis, Catalao, Bays, & Husain, 2011). That study, which used line orientation and color as features, demonstrated the occurrence of feature misbindings when items were presented sequentially, even when the sequential items occurred in different spatial locations. The authors argued that sequential presentation taxes working memory resources, and that this leads to misbinding of features between objects in the sequence, even when the objects are individuated through spatial location. This remains a potential explanation for the present results, as it is not possible to disambiguate sequential presentation from spatial location in our current experiments.

In conclusion, the present findings provide clear evidence in favor of the idea that object identity and orientation are perceived independently of each other, but determining the object’s orientation is contingent on having first identified the object. This asymmetry in resolving the object’s identity and orientation can give rise to incorrect conjunctions of object attributes when multiple objects are presented sequentially in the same spatial location.