Introduction

Mental rotation refers to imagining a visual stimulus rotating to a different orientation than the one depicted. The behavioral signature of mental rotation is an increase in response times as a function of the stimulus’ orientation from the upright (Cooper & Shepard, 1973; Shepard & Metzler, 1971). This mental transformation of visual information has been suggested to be involved in our decision-making processes during a task that requires deciding if rotated objects would face to the left or to the right when imagined at the upright (Jolicoeur, 1985). The symmetric nature of the response time function is the evidential basis for the theoretical assumption that mental rotation is performed through the shortest angular distance to the upright.

Mental rotation is thought to be used on every trial when comparing line drawings of three-dimensional cube figures (Shepard & Metzler, 1971). However, it has been suggested that to determine if a rotated object would face to the left or to the right when imagined at the upright, a mixture of mental rotation and a non-rotation process is used (Kung & Hamm, 2010; Searle & Hamm, 2012). Rotation to the upright through the shortest angular distance requires knowledge of information pertaining to the current orientation of the stimulus and its upright orientation. Both of these pieces of information are defined by the identity of the stimulus, and therefore it is assumed that the stimulus has been identified, to some extent, prior to the occurrence of mental rotation (Corballis, 1988; Hamm & McMullen, 1998). Based upon this information the shortest direction of rotation to the upright can be determined.

Kung and Hamm (2010) proposed that the intended direction of rotation can be used to polarize a stimulus’ horizontal axis without actually mentally rotating that stimulus to the upright. The use of mental rotation, however, is suggested to be necessary to reduce the spatial conflict that arises between the assigned polarity within the object-centered coordinate system and the polarity of the viewer-centered coordinate system. This spatial conflict increases with a stimulus’ orientation from the upright and so the ratio of mental rotation to non-rotation trials varies as a function of orientation (Kung & Hamm, 2010) and also varies between individuals (Searle & Hamm, 2012).

An individual’s response time function for decisions about whether rotated objects would face to the left or to the right when imagined at the upright can be modelled by the Mixture Model (Kung & Hamm, 2010; Searle & Hamm, 2012) as follows:

$$ \mathrm{RT}_{\theta_{\mathrm{s}}} = \mathrm{baseline} + \left(\theta_{\mathrm{s}}/180\right)^{\mathrm{x}} * \left(\theta_{\mathrm{s}} * \mathrm{orientation\ effect}\right) $$
(1)

The term “θs” corresponds to the orientation of an object in terms of the smallest angular departure from the upright. The term “baseline” corresponds to an individual’s mean response time to upright stimuli, collapsed over the left and right response options. The stimulus orientation is converted to a proportion of the degree of inversion by dividing it by 180°. The term “orientation effect” corresponds to the increase in response times per degree of stimulus orientation, as determined by the difference in response times between objects rotated 180° and objects at 0°. The term “x” is an exponent parameter calculated by iteration to minimize the squared error between the modelled and observed response times. The value of x is inversely related to the proportion of trials employing mental rotation.
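As an illustration of how Eq. 1 and the iterative estimation of the exponent might be implemented, a minimal sketch follows. The function names, the grid of candidate exponents, and the response times in the usage note are hypothetical; the original fitting routine is not described beyond "calculated by iteration", so the grid search here is an assumption rather than the authors' procedure.

```python
def predict_rt(theta_s, baseline, orientation_effect, x):
    """Eq. 1: baseline + (theta_s / 180)**x * (theta_s * orientation_effect)."""
    proportion_rotated = (theta_s / 180.0) ** x  # proportion of rotation trials
    return baseline + proportion_rotated * theta_s * orientation_effect


def fit_exponent(thetas, observed, baseline, orientation_effect, candidates=None):
    """Return the candidate exponent with the smallest summed squared error.

    The candidate grid (0.00 to 10.00 in steps of 0.01) is an assumption.
    """
    if candidates is None:
        candidates = [i / 100.0 for i in range(0, 1001)]
    best_x, best_sse = None, float("inf")
    for x in candidates:
        sse = sum(
            (predict_rt(t, baseline, orientation_effect, x) - rt) ** 2
            for t, rt in zip(thetas, observed)
        )
        if sse < best_sse:
            best_x, best_sse = x, sse
    return best_x, best_sse
```

For example, with hypothetical data generated from a baseline of 500 ms, an orientation effect of 1.5 ms/° and an exponent of 2, calling `fit_exponent([0, 60, 120, 180], observed, 500.0, 1.5)` should recover an exponent of 2. Larger exponents shrink the proportion term at intermediate orientations, which is why the exponent is inversely related to the proportion of rotation trials.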

The assumption that mental rotation is carried out through the shortest angular distance to the upright is based upon response times being symmetric around 180°. However, this symmetry is only found when the data for left and right responses are combined. When the data for left and right responses are examined separately the response time functions are asymmetric (Jolicoeur, 1985, 1988; Searle & Hamm, 2012). Responses indicating that objects would face to the left if imagined at the upright tend to be faster when the direction of the shortest rotation is in a counter-clockwise direction compared to in a clockwise direction. The opposite pattern is found for responses corresponding to objects facing to the right. It is this asymmetry in the response time functions that is of primary interest for the current study because it could bring into question a major underlying assumption in the literature, which is that mental rotation is performed through the shortest direction to the upright.

The Mixture Model with the additional exponent parameter (Kung & Hamm, 2010; Searle & Hamm, 2012) does not account for asymmetric response time functions found when deciding whether rotated objects would face to the left or to the right if imagined at the upright. When one considers this task, it becomes apparent that response times are faster when the direction of the shortest rotation corresponds to an object being imagined rotating in the direction it faces. Here we propose that asymmetries of response time functions could be accounted for by the Mixture Model by simply dropping the assumption that mental rotation proceeds through the shortest direction to the upright on the trials it is employed and making the alternative assumption that an object is rotated in the direction it faces. We suggest that rotating in the direction an object faces only affects the distance rotated, not the proportion of trials on which rotation occurs. As this alternative explanation simply replaces one assumption of the Mixture Model with another it does not result in any additional model parameters and so, if comparing models, the model that accounts for more of the variance is to be preferred. To give an example, a left-facing object presented at an orientation of 240° clockwise would be rotated either 120° (shortest distance assumption) or 240° (direction of facing assumption) on a proportion of the trials equal to (120/180)^x. Under both assumptions the non-rotation process is thought to be employed on the remainder of the trials. For each participant, the exact proportions are modified by the participant’s exponent value.

In tasks requiring individuals to decide whether objects would face to the left or to the right when imagined at the upright, the objects are typically animals or vehicles that have well-defined fronts and backs. Objects are thought to be identified prior to mental rotation (Corballis, 1988; Hamm & McMullen, 1998) and therefore it is possible that semantic information pertaining to the typical direction of movement would be available at this stage. This semantic information about an object’s mobility may bias rotating the object in the direction it typically moves. If this is the case, then the response time functions for objects that are known to move, such as vehicles and animals, may be more asymmetric than the response time functions for objects that may have a direction of facing but do not move, such as chairs or houses. This predicted difference in asymmetries could be due to the rate of mental rotation being facilitated by the semantic knowledge of these objects’ mobility, in which case the orientation effect captured by the Mixture Model would also be expected to be smaller for objects known to move compared to objects known to be stationary. Another possible cause of the predicted difference in asymmetries could be that the direction of mental rotation is influenced by the semantic knowledge of what way an object would typically move. If this is the case then it is predicted that the orientation effects should not differ depending on object mobility, but the asymmetry of the response time function should differ.

If a semantic classification regarding a stimulus’ mobility has been made prior to mental rotation, this semantic information could also influence the proportion of trials on which a mental rotation strategy is employed. For example, when presented with a rotated object one might be more likely to employ a mental rotation strategy if this object is typically seen moving because the mental movement corresponds to the mobility-related semantic information known about this object. If mental rotation is more likely to be employed for mobile objects compared to immobile objects then this would be captured by differences in the curvature of individuals' response time functions, as indexed by the exponent parameter applied to the proportional mixture of mental rotation and non-rotation trials (Kung & Hamm, 2010; Searle & Hamm, 2012).

Viggiano and Vannucci (2002) found that individuals were slower to identify mobile objects (i.e., animals and vehicles) compared to immobile objects (i.e., furniture and tools). In the Mixture Model (Kung & Hamm, 2010; Searle & Hamm, 2012), the baseline parameter is thought to capture the time taken to carry out pre-rotation processes, including identification of the stimulus. Therefore, it would be expected that baseline response times would be larger for objects typically seen moving compared to objects typically seen stationary.

In the current study, participants were presented with line drawings depicting objects in profile that, when upright, were facing to the left or to the right. Half of the stimuli depicted objects that typically move (mobile objects), such as vehicles or animals, while the other half of the stimuli were stationary objects (immobile objects), such as chairs or houses. Response times to decide if an object faced to the left or to the right when imagined at the upright were measured and parameters for the Mixture Model were fitted to individual participants’ response time functions based either on the assumption that objects were rotated the shortest route to the upright or that objects were rotated in the direction they faced. The squared error for the predicted data was compared to the squared error relative to a participant’s overall mean to provide a measure of the goodness of fit for each of the models, specifically the percentage change in the sum of the squared error. This goodness of fit measure has a maximum value of 100 % if the model captures all of the observed data, but it is not bounded at zero and can be negative if the model produces more error than the mean. Comparing the goodness of fit between the two models allows for a test of the assumption that stimuli are rotated in the shortest direction to the upright.
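The goodness of fit measure described above can be sketched as follows. This is one reading of "percentage change in the sum of the squared error", taken relative to the error around the participant's overall mean; the function name and the values in the usage note are hypothetical.

```python
def goodness_of_fit(observed, predicted):
    """Percentage reduction in summed squared error relative to the overall mean.

    100 means the model reproduces every observed value; negative values mean
    the model produces more error than the participant's mean alone.
    """
    mean_rt = sum(observed) / len(observed)
    sse_mean = sum((rt - mean_rt) ** 2 for rt in observed)
    sse_model = sum((rt - p) ** 2 for rt, p in zip(observed, predicted))
    return 100.0 * (sse_mean - sse_model) / sse_mean
```

With hypothetical observed values of 500, 600 and 700 ms, a model that predicts them exactly scores 100, a model that predicts the mean (600 ms) throughout scores 0, and a model that predicts 700, 600 and 500 ms scores below zero.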

The secondary aim of the current study was to investigate if mental rotation is influenced by the semantic knowledge associated with an object’s normal movement. This was achieved by comparing model parameter values as fitted to mobile and immobile object sets. Correlation analyses were performed between parameters of the mobile and immobile object sets as a test-retest reliability measure. It was expected that even if parameters differ in value between the object sets, there should be a positive relationship between the values for the baseline response times, orientation effects, and exponent values at the individual level.

Males consistently outperform females on the paper-and-pencil versions of Vandenberg and Kuse’s (1978) Mental Rotations Test (Linn & Petersen, 1985; Voyer, Voyer, & Bryden, 1995); however, for most types of stimuli, the results of computerized tests of mental rotation indicate that the speed of mental rotation does not differ between males and females (Jansen-Osmann & Heil, 2007). To examine whether or not sex differences were present in the current study, performance measures were compared between males and females.

Method

Participants

Twenty-five participants were recruited through the University of Auckland’s School of Psychology website, advertisements posted on notice boards within the University of Auckland, and through personal communications. Data from one participant were excluded from analysis due to a programming error that had images presented sequentially, not in a random order. The 24 remaining participants (12 male/12 female) met the inclusion criteria of a mean accuracy across each object set of at least 75 %, as well as mean accuracies for responses corresponding to left- or right-facing objects at any of the six orientations of at least 60 % in each object set.

Participants ranged in age from 18 to 32 years (M = 22.71, SD = 4.51). All 24 participants were right-handed (laterality quotient range 50–100; M = 77.17, SD = 16.46), as assessed by the Edinburgh Inventory (Oldfield, 1971). All participants reported normal or corrected-to-normal vision. Participants were informed that the experiment would take approximately 1 hour to complete. Each participant was reimbursed with petrol vouchers worth NZ$10. All participants gave their informed consent and procedures were approved by the University of Auckland Human Participants Ethics Committee.

Materials

Stimuli were presented on a color computer monitor with a screen resolution of 1,920 × 1,080 pixels. Luminance levels of the monitor screen were measured for white (M = 99.78 cd/m², SD = 1.33) and black colors (M = 2.29 cd/m², SD = .51), using a Konica-Minolta LS-110 luminance meter. E-Prime Version 2.0 Psychology Software (Schneider, Eschman, & Zuccolotto, 2002) was used for stimulus presentation and to record response times and accuracies. Responses were made on a number pad that was located in front of each participant and was in line with the horizontal midline of the computer monitor.

Each object set consisted of six different line drawings of objects (taken from Hamm & McMullen, 1998; Snodgrass & Vanderwart, 1980; see Fig. 1a and b), with each object presented 12 times facing to the left and 12 times facing to the right at every orientation (0°, 60°, 120°, 180°, 240°, and 300°; see Fig. 1c and d).

Fig. 1

Objects presented left facing at 0° for the immobile object set (a) and for the mobile object set (b). An example presented at each of the six orientations for left facing objects (c) and right facing objects (d). Grey arrowheads indicate the direction of mental rotation as assumed by the Shortest Distance model. Black arrowheads indicate the direction of mental rotation as assumed by the Direction of Facing model

The widths of the objects were between 4.0° and 4.4° of visual angle and the heights of the objects were between 2.6° and 4.2° of visual angle. Participants used a chin-rest to keep their eyes 57 cm from the center of the screen.
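For readers who wish to relate these visual angles to physical size on the screen, the standard geometric relation size = 2 × distance × tan(angle / 2) can be used. The sketch below is an illustration only; the function name is hypothetical and the on-screen sizes in centimetres are not reported in the text.

```python
import math


def visual_angle_to_cm(angle_deg, distance_cm=57.0):
    """Physical extent (cm) subtending angle_deg at the given viewing distance."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


# Widths and heights corresponding to the angular ranges given in the text.
width_range_cm = (visual_angle_to_cm(4.0), visual_angle_to_cm(4.4))
height_range_cm = (visual_angle_to_cm(2.6), visual_angle_to_cm(4.2))
```

At a viewing distance of 57 cm, 1° of visual angle corresponds to very nearly 1 cm on the screen, so the angular sizes above translate almost directly into centimetres.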

Procedure

Object sets were presented in separate blocks to maximize participants’ opportunity to use semantic information, if such use is possible. For each object set, participants completed three blocks of 288 trials, with each of the six objects presented four times for each combination of facing and orientation. Participants were given a break between blocks. Task order was counterbalanced across equal numbers of male and female participants. Participants were instructed to indicate whether an object faced to the left or to the right when imagined at the upright. They responded with the index and middle fingers of their right hand, pressing the “1” key if they thought that the object faced to the left when imagined at the upright, or the “2” key if they thought that the object faced to the right when imagined at the upright. Each object remained on the screen until the participant responded or until 4,000 ms had passed. Within each block, trials were presented in a randomized order and were separated by inter-trial intervals of 1,000 ms. Participants were instructed not to flip inverted objects to the upright through the three-dimensional plane.
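The block structure can be checked with a short sketch that enumerates the design cells and shuffles them. This is an illustration only: the experiment itself was run in E-Prime, and the variable names, object indices, and random seed below are hypothetical.

```python
import itertools
import random

OBJECTS = range(6)                       # six line drawings per object set
FACINGS = ("left", "right")
ORIENTATIONS = (0, 60, 120, 180, 240, 300)
REPETITIONS_PER_BLOCK = 4
BLOCKS_PER_SET = 3


def build_block(rng):
    """Return one block: every object x facing x orientation cell, four times, shuffled."""
    cells = itertools.product(OBJECTS, FACINGS, ORIENTATIONS)
    trials = [cell for cell in cells for _ in range(REPETITIONS_PER_BLOCK)]
    rng.shuffle(trials)
    return trials


block = build_block(random.Random(0))    # seed chosen arbitrarily
trials_per_block = len(block)
presentations_per_cell = REPETITIONS_PER_BLOCK * BLOCKS_PER_SET
```

Six objects, two facings, six orientations, and four repetitions give the 288 trials per block stated above, and three blocks give the 12 presentations of each object at each facing and orientation described in the Materials section.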

Data analysis

Correct response times were tested for outliers, which were eliminated using a recursive procedure with a sliding criterion to minimize sample size bias (Van Selst & Jolicoeur, 1994). The mean correct response times, after outlier elimination, were then analyzed.

When testing the model parameters, the normality of each parameter’s distribution was assessed using a Shapiro-Wilk test. Parametric tests were used when data were normally distributed, and non-parametric tests were used when data violated the assumption of normality. Where appropriate, the alpha level for significance was Bonferroni-corrected for multiple pairwise comparisons; otherwise, the standard alpha level of 0.05 was used to determine significance.

Results

Figures depicted in the following section show data at 0° replicated at 360° for the visual assessment of symmetry. These replicated data at 360° were not used in any of the following statistical analyses.

Specified planned contrasts aided in assessing symmetry of the observed orientation functions. This was achieved by comparing values at 60° with 300°, and 120° with 240°. Orientation effects were tested by assessing the linear trend component between values at 0° and 180° (analogous to rates of mental rotation). The quadratic trend (analogous to the curvature of the orientation function), and the cubic trend were also tested. These contrasts were used for all reported ANOVAs with orientation (6) as a within-subjects factor. ANOVA results are reported using Greenhouse-Geisser corrected degrees of freedom if sphericity was violated.

Mean response times

Mean response times were analyzed with a three-way, within-subjects ANOVA with object set (2), object facing (2), and orientation (6) as factors. Results demonstrated a significant main effect of orientation (F(1.60, 36.72) = 85.96, p < .001, ηp² = .79) with contrasts showing significant linear (F(1, 23) = 123.20, p < .001, ηp² = .84) and quadratic trends (F(1, 23) = 32.22, p < .001, ηp² = .58). There was a significant interaction between object set and the quadratic trend (F(1, 23) = 7.86, p = .010, ηp² = .26). There was a significant interaction between object facing and orientation (F(2.65, 60.91) = 24.81, p < .001, ηp² = .52), with contrasts showing significant interactions between object facing and the symmetries of 60° and 300° (F(1, 23) = 14.07, p = .001, ηp² = .38), and 120° and 240° (F(1, 23) = 65.96, p < .001, ηp² = .74). Of interest to the predictions made in the current study, there were no significant interactions between object set, object facing, and the symmetries of 60° and 300° (F(1, 23) = .003, p = .960, ηp² = .00), or 120° and 240° (F(1, 23) = 2.26, p = .147, ηp² = .09; see Fig. 2a and b).

Fig. 2

Mean observed response times as a function of orientation for the mobile object set (a) and for the immobile object set (b). Mean observed accuracies as a function of orientation for the mobile object set (c) and for the immobile object set (d)

Mean accuracies

Mean accuracies were analyzed with a three-way, within-subjects ANOVA with object set (2), object facing (2), and orientation (6) as factors. Results demonstrated a significant main effect of orientation (F(1.89, 43.46) = 35.72, p < .001, ηp² = .61) with contrasts showing significant linear (F(1, 23) = 53.31, p < .001, ηp² = .70) and quadratic trends (F(1, 23) = 30.34, p < .001, ηp² = .57). There was a significant interaction between object set and orientation (F(3.27, 75.13) = 4.22, p = .007, ηp² = .16), with contrasts showing significant interactions between object set and the linear (F(1, 23) = 6.43, p = .018, ηp² = .22) and quadratic trends (F(1, 23) = 8.32, p = .008, ηp² = .27). There was a significant interaction between object facing and orientation (F(2.25, 51.72) = 11.37, p < .001, ηp² = .33), with contrasts showing significant interactions between object facing and the symmetries of 60° and 300° (F(1, 23) = 8.53, p = .008, ηp² = .27), and 120° and 240° (F(1, 23) = 19.11, p < .001, ηp² = .45). There was a significant interaction between object set, object facing, and the symmetry of 120° and 240° (F(1, 23) = 4.65, p = .042, ηp² = .17; see Fig. 2c and d).

Sex differences in response times and accuracies

The same ANOVAs were repeated, but this time including participants’ sex as a between-subjects variable; only the effects or interactions involving sex are reported. For mean response times, although the overall interaction between sex and orientation was not significant (F(1.56, 34.37) = 3.25, p = .062, ηp² = .13), there was a significant interaction between sex and the cubic (F(1, 22) = 7.13, p = .014, ηp² = .25) and quadratic trends (F(1, 22) = 13.54, p = .001, ηp² = .38), in which response time functions for female participants were more curved than those for male participants. There was a significant interaction between sex and object facing (F(1, 22) = 7.93, p = .010, ηp² = .27). Although the three-way interaction between sex, object set, and orientation was not significant (F(1.51, 33.29) = 2.16, p = .142, ηp² = .09), there was a significant interaction between sex, object set, and the symmetry of 60° and 300° (F(1, 22) = 4.46, p = .046, ηp² = .17). Although the four-way interaction between sex, object facing, object set, and orientation was not significant (F(3.22, 70.75) = 1.75, p = .161, ηp² = .07), there was a significant interaction between sex, object facing, object set, and the quadratic trend (F(1, 22) = 4.38, p = .048, ηp² = .17) and the symmetry of 60° and 300° (F(1, 22) = 8.26, p = .009, ηp² = .27), but not the symmetry of 120° and 240° (F(1, 22) = 1.10, p = .307, ηp² = .05).

For mean accuracies, although the overall interaction between sex and orientation was not significant (F(1.94, 42.77) = 2.79, p = .074, ηp² = .11), there was a significant interaction between sex and the symmetry of 60° and 300° (F(1, 22) = 8.71, p = .007, ηp² = .28). There was a significant interaction between sex and object facing (F(1, 22) = 4.94, p = .037, ηp² = .18). There was a significant interaction between sex, object facing, and orientation (F(2.51, 55.19) = 3.59, p = .025, ηp² = .14) with contrasts showing a significant interaction between sex, object facing, and the symmetry of 120° and 240° (F(1, 22) = 5.34, p = .031, ηp² = .20). Although the three-way interaction between sex, object set, and orientation was not significant (F(3.43, 75.55) = 2.38, p = .068, ηp² = .10), there was a significant interaction between sex, object set, and the linear trend (F(1, 22) = 4.71, p = .041, ηp² = .18).

Calculations of Mixture Model parameters

Kung and Hamm’s (2010) Mixture Model, with the addition of Searle and Hamm’s (2012) proposed exponent parameter, was used to predict individuals’ response times based on two separate assumptions. The first assumption was that when objects are mentally rotated they are rotated through the shortest angular distance to the upright (see Eq. 2 and Fig. 3). The second assumption was that when objects are mentally rotated they are rotated in the direction they face (see Eq. 3 and Fig. 3).

Fig. 3

Observed and predicted response times, combined across object sets, after response times for right facing objects have been flipped over 180° and then collapsed with left facing response times

$$ \mathrm{RT}_{\theta_{\mathrm{s}}} = \mathrm{baseline} + \left(\theta_{\mathrm{s}}/180\right)^{\mathrm{x}} * \left(\theta_{\mathrm{s}} * \mathrm{orientation\ effect}\right) $$
(2)
$$ \mathrm{RT}_{\theta_{\mathrm{f}}} = \mathrm{baseline} + \left(\theta_{\mathrm{s}}/180\right)^{\mathrm{x}} * \left(\theta_{\mathrm{f}} * \mathrm{orientation\ effect}\right) $$
(3)

In Eq. 2 the θs term reflects the orientation of an object from the upright under the assumption that objects are mentally rotated in the direction of the shortest angular distance from the upright and, therefore, that response time functions are symmetric around 180°. This model predicts response times that are collapsed across left and right responses and collapsed across 180°. For example, θs would be 120° for left- or right-facing objects presented at either 120° or 240° from the upright.

In Eq. 3 the θf term reflects the orientation of an object from the upright under the assumption that objects are mentally rotated in the direction an object faces and it produces response time functions that are not symmetric around 180°. When collapsing the data from left- and right-facing objects over orientation, the orientation is considered in terms of degrees to the upright in the direction that an object faces, rather than degrees clockwise. For example, θf would be 120° for left-facing objects rotated 120° clockwise from the upright and it would also be 120° for right-facing objects rotated 240° clockwise from the upright. Additionally, θf would be 240° for left-facing objects rotated 240° clockwise from the upright and also 240° for right-facing objects rotated 120° clockwise from the upright.
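The mapping from a stimulus's clockwise orientation and facing to θs and θf can be sketched as below. The mapping for θf is inferred from the four examples given in the text (it equals the clockwise orientation for left-facing objects and 360° minus that orientation for right-facing objects); the function names are hypothetical.

```python
def theta_shortest(orientation_cw):
    """Smallest angular departure from the upright, in degrees (0 to 180)."""
    o = orientation_cw % 360
    return min(o, 360 - o)


def theta_facing(orientation_cw, facing):
    """Degrees to the upright when rotating in the direction the object faces."""
    o = orientation_cw % 360
    if facing == "left":
        return o
    if facing == "right":
        return (360 - o) % 360
    raise ValueError("facing must be 'left' or 'right'")


def predict_rt_facing(orientation_cw, facing, baseline, orientation_effect, x):
    """Eq. 3: the proportion term uses theta_s, the distance term uses theta_f."""
    theta_s = theta_shortest(orientation_cw)
    theta_f = theta_facing(orientation_cw, facing)
    return baseline + (theta_s / 180.0) ** x * (theta_f * orientation_effect)
```

For a left-facing object at 240° clockwise, `theta_shortest` gives 120° and `theta_facing` gives 240°, so the two models share the same proportion of rotation trials but differ in the distance rotated on those trials.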

Evaluation of the two models

To evaluate whether the Shortest Distance model or the Direction of Facing model best captured individuals' mean response times combined across object sets, the goodness of fit values were calculated for each model (see Appendix 1 for individuals’ parameter values and goodness of fit values). For the Direction of Facing model, the highest goodness of fit value was 97.92 %, whereas the lowest goodness of fit value was 69.06 %. In comparison, for the Shortest Distance model, the highest goodness of fit value was 97.54 %, whereas the lowest goodness of fit value was 48.63 %. The results of a related-samples Wilcoxon Signed Ranks Test showed that the goodness of fit was significantly greater for the Direction of Facing model (Mdn = 89.8 %) than for the Shortest Distance model (Mdn = 87.5 %; Z = −2.80, p = .005, r = .57; see Fig. 4).
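The text does not state how the effect size r was obtained. A common convention for the Wilcoxon test is r = |Z| / √N, and the sketch below assumes that convention with N = 24 participants; both the formula and the sample size used here are assumptions, although they are consistent with the reported value.

```python
import math


def effect_size_r(z, n):
    """Effect size r = |Z| / sqrt(N) (assumed convention, not confirmed by the text)."""
    return abs(z) / math.sqrt(n)


# Reported Z for the comparison combined across object sets; N = 24 is assumed.
r_combined = effect_size_r(-2.80, 24)
```

Under these assumptions `r_combined` rounds to .57, matching the value reported above.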

Fig. 4

Box-and-whisker plot showing the median, 25th percentile, and 75th percentile of the goodness of fit values for the Shortest Distance model and for the Direction of Facing model. These goodness of fit values were calculated after combining object sets. The whiskers extend to the minimum and maximum goodness of fit values and are less than 1.5 times the interquartile range in length; values outside this range are plotted as single data points

In addition, an Akaike Information Criterion value (AICc; Burnham & Anderson, 2002), including the correction for a small-sample bias, was calculated for each model. The Direction of Facing model produced a lower AICc (2281) compared to the Shortest Distance model (2347), indicating that the Direction of Facing model was the preferred model. All further analyses involving individuals’ parameters will use parameters derived from the Direction of Facing model (see Appendix 2 for plots of observed response times and predicted response times derived from the Direction of Facing model for each participant).
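The exact form of the criterion used is not given in the text. For least-squares fits, one widely used form is AIC = n ln(RSS/n) + 2k with the small-sample correction 2k(k + 1)/(n − k − 1). The sketch below assumes that form; the function name, the way parameters are counted, and the numbers in the usage note are hypothetical and do not reproduce the reported values.

```python
import math


def aicc_least_squares(rss, n, k):
    """Small-sample corrected AIC for a least-squares fit.

    rss: residual sum of squares; n: number of observations;
    k: number of estimated parameters (counting convention is an assumption).
    """
    if n - k - 1 <= 0:
        raise ValueError("n must exceed k + 1 for the correction term")
    aic = n * math.log(rss / n) + 2 * k
    return aic + (2 * k * (k + 1)) / (n - k - 1)
```

Because the two models compared here have the same number of parameters, the penalty terms cancel under this form, and the model with the smaller residual sum of squares necessarily has the lower AICc.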

The initial model comparisons were based on response times collapsed across mobile and immobile object sets. To examine object sets separately, the goodness of fit was calculated for the Shortest Distance model and the Direction of Facing model for each object set. For the mobile object set, one participant’s goodness of fit value was negative (−2.25 %) for both models, indicating that neither model provided a better fit than their grand mean response time. This participant was dropped from all further analyses because they showed slower response times for upright right-facing objects than for any other orientation, resulting in unlikely estimates for their model parameters. While a better fit was obtained if their mean response time for upright left-facing objects was employed as their baseline response time, it was deemed more prudent to omit the participant from the analyses altogether.

The goodness of fit values for the two models were compared using a related-samples Wilcoxon Signed Ranks Test. Bonferroni-corrected alpha levels of 0.025 (0.05/2) were used to evaluate significance. Results showed that the goodness of fit was significantly greater for the Direction of Facing model than for the Shortest Distance model for both the mobile object set (Mdn = 90.4 % vs. Mdn = 83.0 %; Z = −2.71, p = .007, r = .56) and the immobile object set (Mdn = 90.0 % vs. Mdn = 86.5 %; Z = −2.46, p = .014, r = .51).

Influence of object mobility on parameters

While the previous ANOVA conducted on response times indicated that the asymmetries of the response time functions do not significantly differ between mobile and immobile object sets, to examine if the other aspects of performance differed, related-samples Wilcoxon Signed Ranks tests were carried out to compare individuals’ baseline response times, orientation effects, and exponent values between mobile and immobile object sets. Bonferroni-corrected alpha levels of 0.017 (0.05/3) were used to evaluate significance. Individuals’ baseline response times were significantly longer for the mobile object set (Mdn = 538.6 ms) than for the immobile object set (Mdn = 517.4 ms; Z = −2.71, p = .007, r = .56; see Fig. 5a). Individuals’ orientation effects did not differ significantly between the mobile object set (Mdn = 1.5 ms/°) and the immobile object set (Mdn = 1.4 ms/°; Z = −.94, p = .346, r = .20; see Fig. 5b). Individuals’ exponent values were significantly smaller for the mobile object set (Mdn = 1.7) than for the immobile object set (Mdn = 2.0; Z = −2.83, p = .005, r = .59; see Fig. 5c), indicating that mental rotation is employed on a greater proportion of trials for the mobile object set compared to the immobile object set.

Fig. 5

Box-and-whisker plots for baseline response times (a), orientation effects (b), and exponent values (c) based on the Direction of Facing model for the mobile and immobile object sets

Spearman’s rank-order correlations were conducted to examine relationships between corresponding parameters derived from the Direction of Facing model for each object set. Bonferroni-corrected alpha levels of 0.017 (0.05/3) were used to evaluate significance. One-tailed tests were used because these parameters are thought to reflect individual characteristics in task performance and therefore positive correlations between the parameters were the only relationships to be expected. Baseline response times (rs(21) = .70, p < .001; one-tailed; see Fig. 6a), orientation effects (rs(21) = .48, p = .010; one-tailed; see Fig. 6b), and exponent values (rs(21) = .78, p < .001; one-tailed; see Fig. 6c) were all significantly correlated between the mobile and immobile object sets.

Fig. 6

Scatter plots showing correlations between ranked baseline response times (a), orientation effects (b), and exponent values (c) for the mobile and immobile object sets

Spearman’s rank-order correlations were also conducted within each object set to examine the relationships between the different parameters. Bonferroni-corrected alpha levels of 0.008 (0.05/6) were used to evaluate significance. The only correlation to reach significance was between exponent values and orientation effects (rs(21) = .55, p = .007) within the mobile object set.
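The Bonferroni-corrected alpha levels used throughout this section follow from dividing 0.05 by the number of comparisons in each family. A minimal sketch, with hypothetical function names, is given below.

```python
def bonferroni_alpha(n_comparisons, alpha=0.05):
    """Per-comparison alpha level after Bonferroni correction."""
    return alpha / n_comparisons


def is_significant(p_value, n_comparisons, alpha=0.05):
    """True if p_value falls below the Bonferroni-corrected alpha level."""
    return p_value < bonferroni_alpha(n_comparisons, alpha)


# Families of 2, 3, and 6 comparisons, as used in the analyses above.
corrected_levels = {n: round(bonferroni_alpha(n), 3) for n in (2, 3, 6)}
```

Rounded to three decimals, the corrected levels are 0.025, 0.017, and 0.008. The within-set correlation between exponent values and orientation effects (p = .007) therefore falls just below the threshold for a family of six comparisons.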

Sex differences in parameters

To simplify the evaluation of sex differences in task performance, baseline response times, orientation effects, and exponent values were analyzed with separate two-way mixed ANOVAs with object set (2) and sex (2) as factors. There were no main effects or interactions involving sex for baseline response times or for orientation effects (all p > .05). For exponent values, there was a significant main effect of sex (F(1, 21) = 11.67, p = .003, ηp² = .36), and a post hoc test revealed that the exponent values were significantly larger for females (M = 2.25; SE = .15) compared to males (M = 1.55; SE = .14). There was no significant interaction between object set and sex (F(1, 21) = .84, p = .370, ηp² = .04). These results are consistent with the aforementioned ANOVA that tested for sex differences in response times.

Discussion

Individuals’ response times for decisions about the direction of facing of rotated objects were modelled by Kung and Hamm’s (2010) Mixture Model, with individual differences in curvature of the response time functions captured by applying an exponent parameter to the proportional mixture of mental rotation and non-rotation trials (Searle & Hamm, 2012). These exponent values were fitted to response times under two assumptions. The first assumption was that, when mental rotation was employed, objects would be rotated through the shortest angular distance to the upright. Under this assumption the Mixture Model predicts response times to be symmetric around 180°. The second assumption was that, when mental rotation was employed, objects would be rotated in the direction they faced. Under this assumption the Mixture Model was adapted, without the addition of any new parameters, to capture the asymmetric nature of response times around 180°. Overall, the latter assumption provided a better fit to the majority of individuals’ response times, with the response times of 18 participants being better fit by the Direction of Facing model and the response times of five participants being better fit by the Shortest Distance model. These results suggest that information regarding the direction that an object faces influences the direction in which that object is mentally rotated, regardless of whether or not the path of rotation corresponds to the shortest angular distance to the upright.
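The difference between the two assumptions comes down to the angular distance an object is rotated, which can be sketched in a few lines of Python. The sketch rests on assumptions not stated in this section: orientation is measured in degrees clockwise from the upright, right-facing objects are turned clockwise, and left-facing objects are turned counterclockwise. The function names are hypothetical, and the sketch covers only the distances, not the mixture of rotation and non-rotation trials.

```python
def shortest_distance(theta):
    """Degrees of rotation to the upright along the shorter path."""
    theta = theta % 360
    return min(theta, 360 - theta)

def facing_distance(theta, faces_right):
    """Degrees of rotation to the upright when the object is always turned
    in its direction of facing (ASSUMED clockwise for right-facing objects)."""
    theta = theta % 360
    if faces_right:
        # Keep turning clockwise until the object reaches 360 (the upright).
        return (360 - theta) % 360
    # Turn counterclockwise back to 0 (the upright).
    return theta

# Distances for a right-facing object at the orientations discussed in the text.
orientations = (60, 120, 240, 300)
shortest = {t: shortest_distance(t) for t in orientations}
facing = {t: facing_distance(t, faces_right=True) for t in orientations}
```

Under these assumptions the shortest-path distances are mirror images around 180° (60° and 300° both require 60° of rotation), whereas the direction-of-facing distances are not (300° versus 60°), which is the source of the predicted asymmetry.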

An alternative interpretation for the asymmetric nature of the response time functions is that these asymmetries are due to the vertical location of the fronts of the objects on the screen, whereby response times are faster when the fronts of the objects are located closer to the top of the screen. This alternative is derived from the finding that, when deciding if a dot is at the top or at the bottom of a rotated object, response times are faster when the dot is located closer to the top of the screen (Jolicoeur, Ingleton, Bartram, & Booth, 1993; Light & Hamm, 2008). However, it should be noted that when the decision is based on whether a dot is at the front or at the back of a rotated object, there is no evidence to suggest that response times differ depending on the vertical location of the dot on the screen (Jolicoeur et al., 1993).

As shown in Fig. 7, the vertical locations of the fronts of objects on the screen are the same for objects rotated 60° clockwise and 300° clockwise as for objects rotated 120° clockwise and 240° clockwise, and therefore the distances between the high locations and the low locations of the fronts of the objects are the same for both pairs of orientations. If the asymmetries of the response time functions are due to the differences in the vertical locations of the fronts of the objects on the screen, then the asymmetry between objects rotated 60° and 300° should be equal to the asymmetry between objects rotated 120° and 240°. A paired samples t-test compared asymmetry values, calculated by finding the difference in response times for objects presented with their fronts low on the screen compared to objects with their fronts high on the screen, for objects rotated 60° and 300° to those for objects rotated 120° and 240°. Asymmetry values were significantly larger for objects rotated 120° and 240° (M = 76.61 ms, SD = 45.25 ms) than for objects rotated 60° and 300° (M = 20.14 ms, SD = 25.74 ms; t(22) = −8.05, p < .001, r = .86). This result indicates that it is not likely that the asymmetries arise due to the vertical locations of the fronts of objects on the screen.
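The comparison above can be sketched in Python as a paired-samples t-test on per-participant asymmetry values. The per-participant data are not given in the text, so the three-participant lists below are hypothetical and serve only to show the computation. The conversion r = sqrt(t² / (t² + df)) is an assumption about how the effect size was obtained, although it does reproduce the reported r = .86 from t(22) = −8.05.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(first, second):
    """Return (t, df) for a paired-samples t-test on first - second."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1

def effect_size_r(t, df):
    """Convert a t statistic to the effect size r = sqrt(t^2 / (t^2 + df))."""
    return sqrt(t ** 2 / (t ** 2 + df))

# HYPOTHETICAL asymmetry values (ms) for three participants, illustration only.
asym_60_300 = [10.0, 25.0, 20.0]
asym_120_240 = [60.0, 90.0, 75.0]
t_demo, df_demo = paired_t(asym_60_300, asym_120_240)

# Effect size recomputed from the reported statistic t(22) = -8.05.
r_reported = effect_size_r(-8.05, 22)
```

A negative t here means the asymmetry is larger for the 120°/240° pair than for the 60°/300° pair, matching the direction of the reported result.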

Fig. 7

Illustration of the vertical locations of the fronts of objects presented at 60°, 120°, 240°, and 300° from the upright

The stimuli used in the current study depict familiar objects and, therefore, semantic knowledge regarding the typical movements of these objects would have developed and been consolidated over many years. The asymmetries of the response time functions and the orientation effects were similar for the mobile object set and the immobile object set, suggesting that, when making a decision about whether an object faces to the left or to the right if imagined at the upright, semantic information pertaining to the common movements of that object does not appear to influence the direction in which the object is rotated, nor does it influence the speed at which one mentally rotates it. There is also evidence that the speed of mental rotation is not influenced when an attempt is made to develop knowledge regarding the movements of novel stimuli immediately prior to the mental rotation process taking place (Borst, Kievit, Thompson, & Kosslyn, 2011). It appears that the speed of mental rotation is not influenced by mobility-related semantic knowledge regardless of how well developed this knowledge is.

While one’s knowledge about the mobility of objects does not influence the speed or the direction of mental rotation, individuals’ response time functions tended to be less curved for mobile objects compared to immobile objects, suggesting that mental rotation was relied upon more often for mobile objects. This finding indicates that semantic knowledge of an object’s mobility may influence the tendency to employ a mental rotation strategy to decide if that object would face to the left or to the right when imagined at the upright. It could be that the option to mentally move an object, by means of a rotation, is more semantically compatible with objects that are typically seen moving compared to objects that are typically seen stationary.

Longer baseline response times for the mobile object set compared to the immobile object set indicate that one or more of the processes prior to, or subsequent to, mental rotation took more time to complete for the mobile object set. This response time benefit for immobile objects over mobile objects is consistent with Viggiano and Vannucci’s (2002) finding that individuals tend to identify immobile objects (i.e., furniture and tools) faster than mobile objects (i.e., animals and vehicles). Our finding provides further evidence that the baseline parameter is influenced by pre-rotation processes such as object identification.

Initial object identification is thought to be based on global shape properties of an object (Bar, 2003; Hamm & McMullen, 1998). Although the current study suggests that semantic information regarding the mobility of an object influences pre-rotation processes and the probability of using a mental rotation strategy, this might not always be the case. The stimuli used for the current study had visually distinct structural properties; however, if the perceived structure of an object, such as an immobile toy horse, is highly similar to the structure of another semantic class of objects, such as a mobile horse, then one might process these two objects as if they belonged to the same mobility-related semantic category. Given the small sample of objects used in the current study, investigations with a larger set of stimuli are required to test this idea.

Despite the differences in baseline response times and exponent values, all three parameters were correlated between the mobile object set and immobile object set at the individual level. These correlations provide further evidence to suggest that these parameters capture individual characteristics relating to successful decisions about whether rotated objects would face to the left or to the right when imagined at the upright.

The stimuli used for the rotated left- and right-facing object discriminations are required to be asymmetric around the object’s vertical axis. It is our semantic knowledge of these objects that defines one pole as the object’s front and the opposite pole as the object’s back. We suggest that it is access to this semantic information that leads to the objects being rotated in the direction of facing and that rotating a stimulus in the direction of facing may be a strategy that only generalizes to stimuli that have a defined front and back and not necessarily to all objects that are asymmetric around their vertical axes. For example, unlike objects, letters do not have commonly recognized fronts and backs, despite having a normal direction of facing. Given that letters have a direction of facing, it could be that a letter’s front is defined by the side that contains the most prominent and distinguishing features. For example, the front of a normal version of the letter “R” would be on the right-hand side where the loop and diagonal line are found while a “J” would have the front on the left. When deciding if a rotated letter would be in its normal or mirror-image version if imagined at the upright, the resulting response time function can be asymmetric, but not to the same extent as when deciding if a rotated object would face to the left or to the right if imagined at the upright (Searle & Hamm, 2012). Searle and Hamm (2012) noted that the letter “J” did not show a reversal of the asymmetry relative to rightward-facing letters, such as “R” (see their Footnote 1). However, with the distinctive feature of a normal version of the letter “J” being low and on the left, rotating about the center of the stimulus by leading with this feature would result in the same clockwise rotation as for “R” and the other right-facing letters.

To examine whether these letters tend to be rotated in the direction of facing or through the shortest distance to the upright, we took the response times of 24 participants, as collected and previously published by Searle and Hamm (2012), and predicted individuals’ response times to decide whether rotated letters were in their normal or mirror-image versions, when imagined at the upright, using the Direction of Facing model and the Shortest Distance model. Based on the participants’ data from Searle and Hamm’s (2012) study, the response times predicted by the Direction of Facing model resulted in an average goodness of fit value of 86.6%. The response times predicted by the Shortest Distance model resulted in an average goodness of fit value of 87.3%. Moreover, the Akaike Information Criterion value (Burnham & Anderson, 2002), including the correction for small-sample bias, was lower for the Shortest Distance model (2626) than for the Direction of Facing model (2666), indicating a preference for the Shortest Distance model. Eleven participants showed a higher goodness of fit with the Shortest Distance model, while 13 participants showed a higher goodness of fit with the Direction of Facing model. This indicates that, although overall the Shortest Distance model was preferred over the Direction of Facing model, some participants tended to rotate the letters in the direction that the features of the letters appeared to be facing. This re-analysis of the data from Searle and Hamm’s (2012) study demonstrates that both the Shortest Distance and Direction of Facing versions of the Mixture Model can be compared in the mirror/normal rotated letter discrimination task as well and that individual differences in the direction of mental rotation can be captured. It remains to be seen if, to make decisions about whether rotated letters are in their normal or mirror-image versions, the tendency to rotate through the shortest distance or in the direction of facing is a stable individual strategy and what may influence this decision.
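The model comparison can be illustrated with a short Python sketch of the small-sample corrected criterion, AICc = AIC + 2k(k + 1)/(n − k − 1). The text does not state how the likelihood term was computed, so the least-squares form AIC = n ln(RSS/n) + 2k used here is an assumption, and the function names are hypothetical. The two criterion values are those reported above.

```python
from math import log

def aicc_least_squares(rss, n, k):
    """AICc for a least-squares fit (ASSUMED form) with residual sum of
    squares rss, n observations, and k estimated parameters."""
    aic = n * log(rss / n) + 2 * k
    correction = (2 * k * (k + 1)) / (n - k - 1)
    return aic + correction

def preferred_model(scores):
    """Return the name of the model with the lowest criterion value."""
    return min(scores, key=scores.get)

# Criterion values reported in the text for the letter re-analysis.
reported = {"Shortest Distance": 2626, "Direction of Facing": 2666}
best = preferred_model(reported)
delta = reported["Direction of Facing"] - reported["Shortest Distance"]
```

Lower values indicate the preferred model, so the 40-point difference favors the Shortest Distance model for the letter data, even though slightly more individual participants were better fit by the Direction of Facing model.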

The current study found that baseline response times and orientation effects did not differ between males and females, indicating that one or more of the processes prior to, or subsequent to, mental rotation and speed of mental rotation itself did not differ between males and females. Jansen-Osmann and Heil (2007) also found that the speed of mental rotation did not differ between males and females using similar object stimuli, although in their task participants were required to decide if two presented objects were the same or different. The current study found that the exponent values were larger overall for females than for males, suggesting that females were less reliant on a mental rotation strategy to decide if rotated objects would face to the left or to the right when imagined at the upright. This finding is consistent with research suggesting that strategy differences exist on mental rotation tasks when comparing the performances of males and females (Butler et al., 2006; Heil & Jansen-Osmann, 2008; Hirnstein, Bayer, & Hausmann, 2009). While these sex difference findings are interesting, they are presented with caution as the group sizes were small when comparing the parameters of males and females.

In conclusion, we suggest that the asymmetric response time function found when discriminating left- and right-facing rotated objects is due to mentally rotating the objects in the direction of facing, even if this rotation does not follow the shortest path to the upright. The front of an object determines its direction of facing and therefore its path of rotation. This indicates that the front of an object must be located prior to mental rotation. Moreover, semantic information about common movements of an object is also known prior to mental rotation and, while it does not influence the speed of rotation, this knowledge does influence the probability of employing mental rotation whereby mental rotation is employed more frequently if an object is typically known to move rather than remain stationary. These findings are consistent with the notion that objects are identified, to some extent, prior to mental rotation (Corballis, 1988; Hamm & McMullen, 1998). The influence of the direction of facing may be due to the task requiring discriminations between left- and right-facing objects and would not necessarily be something to expect to influence performance in other mental rotation tasks with different task demands, such as those requiring judgments about whether two stimuli are the same or different. We have proposed that individuals rotate objects in the direction of facing and, since this only affects the distance rotated, no additional participant parameters are required when adjusting Searle and Hamm’s (2012) adaptation of Kung and Hamm’s (2010) Mixture Model.