Introduction

Communication between air traffic controllers and flight crews primarily involves giving and receiving navigation instructions. Errors often occur under these circumstances, and although they are usually caught and corrected, they sometimes lead to incidents and accidents (for reviews see Barshi, 1997; Tiewtrakul & Fletcher, 2010). We have been studying this communication situation with the eventual goal to determine ways to reduce such critical errors (Barshi & Healy, 1998, 2002; Healy, Schneider, & Barshi, 2009; Schneider, Healy, & Barshi, 2004; Schneider, Healy, Barshi, & Kole, in press). In particular, we have been investigating factors influencing subjects’ ability to follow navigation instructions in a laboratory paradigm (developed by Barshi, 1997) that is meant to mimic communication between air traffic controllers and flight crews. In that paradigm, subjects hear navigation instructions (e.g., “turn left two squares, climb up one level”), repeat them aloud, and then follow them by manually navigating within a space depicted as a diagram on a two-dimensional computer screen (see Fig. 1). The three-dimensional space is shown as four two-dimensional matrices stacked one on top of another.

Fig. 1
figure 1

Sample display. On the right is the display that the subject sees in the experiment; on the left is the three-dimensional space it represents

In aviation communication, when air traffic controllers issue navigation instructions to pilots, the pilots are expected to repeat (“readback”) those instructions prior to following them. In Barshi and Healy (2002), we showed that the accuracy of subjects’ oral repetition responses (readback) in our experimental paradigm depended on the nature of the space to which the navigation instructions applied, even when the same instructions were used for different navigational spaces. That is, immediate memory for the spoken commands depended on the way those commands were interpreted by the subjects. Specifically, by varying how the subjects were to interpret the words “up” and “down,” we led subjects to view the navigation instructions either as involving all three dimensions depicted in the space by moving across multiple matrices, or as involving only two of the three dimensions by moving only within a single matrix. We found that the accuracy of the subjects’ oral repetition responses was worse in the multi-matrix condition than in the single-matrix condition, and we argued that the difference between the two matrix conditions was due to the fact that the verbal representation underlying the immediate repetition responses depends on the spatial representation of the space within which the navigation task is performed.

An assumption in this work was that subjects would form a mental model (Johnson-Laird, 1983) of the navigational space, and map onto that spatial representation the verbal navigation instructions heard. The assumed spatial representation is similar to van Dijk and Kintsch's (1983) "situation model" (see also, Kintsch, 1988; Zwaan & Radvansky, 1998) in that it is constructed on the basis of the given situation and is used in the interpretation of the language input. However, even if this characterization is accurate, different aspects of constructing that mental representation could be responsible for the observed disadvantage for the multi-matrix condition relative to the single-matrix condition. Our current study explores three such possible aspects: (a) the lack of depth cues, (b) the use of diagrams, and (c) the role of experimental instructions.

The lack of depth cues

The role played by the presence of depth cues in the perception and representation of space has been the subject of much debate (e.g., Kerr, 1993). Kerr found no special role for the depth dimension in determining difficulty of following navigation instructions; instead, the number of dimensions was a determining factor, as specified in the theory of Gibson (1979). In contrast, Bryant and Tversky (1999) found that depth cues played a crucial role in memory for depictions of scenes. In Barshi and Healy (2002), we found that when subjects were led to view navigation instructions as occurring in a three-dimensional space (involving multiple matrices), their memory for the instructions, as well as their ability to follow them, was worse than when the subjects were led to view the same instructions as occurring in a two-dimensional space (involving a single matrix). To determine whether the lack of depth cues in the display of the navigational space was crucial for this effect, we directly manipulated the presence of depth cues by comparing diagrams with and without perspective (see Fig. 2) in the present Experiment 1. In addition, in the present Experiment 2, we also use a three-dimensional model, providing full depth information, to depict the navigational space.

Fig. 2
figure 2

Sample display in perspective condition

If the lack of depth cues has sole responsibility for the difference between the multi-matrix and single-matrix conditions in Barshi and Healy (2002), then the effect should be eliminated with the introduction of depth cues.

The use of diagrams

The use of a three-dimensional model as opposed to a two-dimensional diagram of the navigational space has in itself been shown to have an important impact on the mental representation of the space depicted. For example, Ittelson (1996) argued that the perception of markings (roughly, two-dimensional patterns that carry meanings), and in particular pictures, is very different from that of the real world. However, even though he argued that scenes depicted in flat diagrams are perceived differently from the way in which the corresponding real-world scenes are perceived, it is possible that the mental representations resulting from these different perceptions of diagrams and scenes are nonetheless identical (see, e.g., Klatzky, Wu, & Stetten, 2008; and Loomis, Klatzky, Avraamides, Lippa, & Golledge, 2007, for evidence of amodal spatial representations). Contrary to this possibility, Bryant and Tversky (1999) showed that the spontaneous representation arising when viewing a diagram is indeed different from that arising when viewing a model of the same scene. To determine whether the use of diagrams was crucial for the effect found by Barshi and Healy (2002), we compared two-dimensional diagrams of the navigational space with a three-dimensional model of that space in the present Experiment 2.

If the use of two-dimensional diagrams has sole responsibility for the difference between the multi-matrix and single-matrix conditions in Barshi and Healy (2002), then the effect should be removed with the introduction of a three-dimensional model.

The role of experimental instructions

Bryant and Tversky (1999) emphasized that it is not the depictions themselves that determine the mental representation but rather the interpretations of the depictions, and that such interpretations can be influenced by the experimental instructions given. They also argued that individuals are able to activate one mental representation or another depending on the viewpoint they choose (e.g., in response to specific experimental instructions). This ability to select intentionally a mental representation of space is consistent with Naveh-Benjamin's (1987) argument that the coding of spatial information is not necessarily an automatic process. The potential impact of experimental instructions on mental representation was perhaps demonstrated most clearly in the research of Kotovsky, Hayes, and Simon (1985), which showed a huge impact on the ability to solve isomorphic problems as a function of how the problems were framed in the experimental instructions (see Huttenlocher & Presson, 1979; Presson, 1982; and Wraga, Creem, & Proffitt, 1999, 2000, for other examples of the influence of instructions on performance in tasks requiring spatial cognition). To determine whether the specific experimental instructions about how to view the navigational space were crucial for the effect found by Barshi and Healy (2002), we introduced in the present Experiment 3 a new set of experimental instructions that did not promote a three-dimensional mental representation of the space within which the navigational task was to be performed.

If experimental instructions that promote a three-dimensional representation have sole responsibility for the difference between the multi-matrix and single-matrix conditions in Barshi and Healy (2002), then the effect should not be found with experimental instructions that do not promote a three-dimensional representation.

Thus, our present study involves three experiments designed to explore three important aspects of the experimental situation that might be responsible for the difference between the multi-matrix and single-matrix conditions observed in the study by Barshi and Healy (2002)—the lack of depth cues, the use of diagrams, and the role of experimental instructions—because of their possible influence on the subjects’ construction of a mental representation of the navigational space. What is theoretically at stake here is the critical conclusion that the verbal representation of navigational instructions depends on the mental representation of the navigational space. If the difference between the two matrix conditions does not depend on any one of these three important aspects of the experimental situation, the finding would be of greater generality or external validity and the critical conclusion would not be limited to a specific experimental situation.

Experiment 1

Barshi and Healy (2002) demonstrated the interdependence of the verbal and spatial representations of navigation instructions by comparing two matrix conditions, both of which involved movements along only two directions, and the same verbal navigation instructions were used in both conditions. The specific difference between the two conditions was in how the words up and down were to be interpreted by the subjects. In one condition (multi-matrix), commands to go up and down led to movements from one matrix to another, whereas in the other condition (single-matrix) the same commands led to movements within just one matrix. There were large differences between the two matrix conditions, both in terms of the manual movement responses (subjects’ ability to execute the navigation instructions by clicking on the appropriate locations on the grid) and, importantly, in terms of the immediate oral repetition responses (subjects’ ability to read back the navigation instructions). We attributed these results to the assumption that movement from one matrix to another requires a more complex spatial representation, and that the complexity of this spatial representation affects the verbal representation as well as the performed movements.

Experiment 1 was a replication of the experiment by Barshi and Healy (2002) with an added between-subjects variable of display (flat, perspective). If the difference we found between the matrix conditions of the previous experiment was due to the lack of depth cues, then introducing depth cues by using a diagram drawn towards a vanishing point should reduce or eliminate that difference.

Barshi and Healy (2002) included two additional manipulations, one of number of commands and the other of command wordiness. Each message consisted of one to six commands, and each command, which provided information about the direction and extent of movement, was phrased with either two words (e.g., up two) or four words (e.g., climb up two levels). The additional words in the four-word commands provided no additional information and thus did not add to the complexity of the message (Prinzo, Hendrix, & Hendrix, 2006). Barshi and Healy found huge effects on memory of the number of commands but very small effects of the number of words within each command. Employing these two manipulations in our present experiment allowed us to replicate this earlier finding demonstrating that the mental representation of the messages is based on propositional units rather than on words (Kintsch & Keenan, 1973) and that the capacity limit of working memory is about three propositional units.

We examine both the oral repetition responses, which provide an index of immediate memory for the verbal instructions, and the manual movement responses, which provide an index of execution accuracy. Finding an effect of matrix condition on oral repetition responses is of greatest theoretical interest. The effect of matrix condition on manual movement responses is used here primarily to confirm the findings from the oral repetition responses.

Method

Subjects heard messages telling them where to move in a space shown on a computer screen. The navigation instructions were limited to commands to turn right or left and to climb up or down. Subjects repeated the instructions aloud and then used their computer mouse to follow the instructions by clicking in the space displayed on the computer screen.

Subjects

Forty-eight undergraduate students at the University of Colorado, Boulder, participated for credit in a course in introductory psychology. All subjects were native English speakers.

Design

The computer screen displayed a grid of four stacked 4X4 matrices (see the right side of Fig. 1). Subjects were instructed that the grid represents a three-dimensional space (see the left side of Fig. 1) and were also shown a small three-dimensional model representing that space. The constant starting position was a filled-in square. Only the numbers one, two, and three were used in the commands. None of the commands led subjects to “fall off” the grid. There was a consistent structure to the commands: turn (left, right) (one, two, three) square(s), followed by climb (up, down) (one, two, three) level(s). Also, turn was always used with movement right or left a number of squares, and climb was always used with movement up or down a number of levels. In addition, the commands always occurred in a fixed, alternating order, with turn always preceding climb.

Two different matrix conditions were compared, which used the same verbal navigation instructions but required different executions of these instructions. Specifically, the conditions varied in how the subjects were to move given the commands to climb up or down a certain number of levels. In the multi-matrix condition, subjects moved from one matrix to another, whereas in the single-matrix condition, subjects moved from one row to another within one matrix.

The messages included one to six commands. For example, a message with six commands was “turn left one square, climb up two levels, turn right three squares, climb down two levels, turn left three squares, climb up two levels.” The words turn and climb and the words squares and levels were unnecessary. To test for any effects of wordiness, half of the messages included wordier four-word commands, and half included minimal two-word commands that contained only the critical words (e.g., "left one, up two, right three, down two, left three, up two"). Each combination of number of commands and wordiness was used equally often.

Two different displays were compared: the flat display used by Barshi and Healy (2002) (see Fig. 1) and a new perspective display (see Fig. 2). The navigation instructions were the same for the two computer displays.

There was a total of 72 experimental trials, which were divided into six 12-trial blocks, with every block including one trial of each combination of number of commands and wordiness in a pseudorandom order. Across the full set of 72 trials, in a given serial position each possible command (e.g., “turn left one square”) was used the same number of times in each combination of number of commands and wordiness. Before the experimental trials, subjects were given verbal experimental instructions about how to follow the commands as well as a demonstration by the experimenter with a small three-dimensional model. In addition, a one-trial animated demonstration was presented on the computer followed by 12 practice trials, which included one message at each combination of wordiness and number of commands in an order by which predicted difficulty increased systematically. Unlike the experimental trials, subjects were given feedback on each of the practice trials. Also, the practice trials included a different starting point from the experimental trials; the starting points were mirror images of each other.

Apparatus and materials

Two different computer displays were used: flat and perspective. The auditory stimuli were identical in all four between-subjects conditions. The commands were spoken by a male native English speaker using as natural a manner as possible. The speaker’s voice had been digitized on a Macintosh SE30 computer with the program SoundEditPro. The clearest sample of each word had been spliced out of the speech stream. The appropriate words were then played by a Macintosh II computer. Thus, the natural stress and intonation pattern were preserved within a given word but not over an entire command.

Procedure

Subjects heard messages with one to six commands and two or four words per command. Each message was followed by a beep. After the beep was heard, the subjects’ first task was to repeat the message aloud and then click with the computer mouse a button labeled DONE (see Fig. 3). The subjects’ oral repetition responses were audio taped. Their next task was to follow the navigation instructions by clicking with the mouse on the appropriate squares in the grid. To turn right or left, they were to move horizontally within the same matrix. To climb up or down, they were to move vertically within the same matrix in the single-matrix condition, but to move vertically to a different matrix in the multi-matrix condition. For example, in response to the command in the single-matrix condition "climb down one level," subjects were to click on the box immediately below the one they were on in the same matrix (see the numeral 3 in Fig. 3 top panel). In response to the same command, subjects in the multi-matrix condition were to click on the same box as the one they were on in the matrix below the one they were on (see the numeral 3 in Fig. 3 bottom panel). Subjects were required to click every box they passed. Thus, subjects were to make the same number of clicks as the number in the instructions (i.e., “two squares” or “two levels” = two clicks, “three squares” or “three levels” = three clicks). Separating trials was a 2 s pause.

Fig. 3
figure 3

Sample instructions and movements in the single-matrix condition (top panel) and in the multi-matrix condition (bottom panel). Note: The subjects did not see the instructions. Also, the numerals shown here were not seen by the subjects. The starting point is the filled-in square

Analyses

A mixed factorial analysis of variance (ANOVA) was conducted for the oral repetition responses as well as for the manual movement responses. Each ANOVA included the between-subjects factors of display (flat, perspective) and of matrix condition (single, multi) and the within-subjects factors of wordiness (minimal, wordier) and number of commands (1 to 6). Each analysis involved a strict scoring procedure by which a trial was scored as correct only when all of the responses for the trial were correct. If the subject missed one of the two critical words in a command or said the wrong critical word, the trial was scored as an error for the oral repetition responses. Likewise, if the subject missed a click or made an incorrect click, the trial was scored as an error for the manual movement responses. It should be noted that the extra words in the wordier messages (i.e., turn, climb, square(s), level(s)) were not included in the scoring so that the same information was required for the repetition of minimal and wordier messages.

Results

Oral repetition responses

The results for the oral repetition responses are summarized in Table 1. There was no main effect of display and no reliable interactions involving display.

Table 1 Proportion of errors on oral repetition responses in Experiment 1 as a function of display, matrix condition, wordiness, and number of commands

Importantly, there was a reliable main effect of matrix condition, F(1, 44) = 18.16, MSE = 0.90, p < 0.001, η2 = 0.292. Subjects made more errors overall in the multi-matrix condition (0.38) than in the single-matrix condition (0.24). Matrix condition also interacted reliably with number of commands, F(5, 220) = 14.01, MSE = 0.18, p < 0.001, η2 = 0.242, because the advantage of the single-matrix condition over the multi-matrix condition increased as the number of commands increased (see Fig. 4), perhaps because of the corresponding increase in the number of matrix shifts. Recall that there was an alternation of “left/right” and “up/down” commands and only “up/down” requires a matrix shift and only in the multi-matrix condition; hence, there were zero, one, one, two, two, and three matrix shifts in the multi-matrix condition as the number of commands increased from one to six, respectively.

Fig. 4
figure 4

Proportion of errors as a function of matrix condition and number of commands for the oral repetition responses in Experiment 1. Error bars represent standard errors of the mean

There was also a reliable main effect involving number of commands. As in all previous experiments using this task (e.g., Barshi & Healy, 2002), errors increased dramatically and monotonically with number of commands, F(5, 220) = 261.30, MSE = 0.18, p < 0.001, η2 = 0.856. On the other hand, there was no reliable main effect of wordiness. However, there was a reliable interaction involving wordiness and number of commands, F(5, 220) = 4.93, MSE = 0.11, p < 0.001, η2 = 0.101, because wordier messages had an advantage only for the messages with a large number of commands.

Manual movement responses

The results for the manual movement responses are summarized in Table 2. There was no main effect of display (F < 1). Furthermore, there were no reliable two-way interactions involving display.

Table 2 Proportion of errors on manual movement responses in Experiment 1 as a function of display, matrix condition, wordiness, and number of commands

Again, importantly, there was a reliable main effect of matrix condition, F(1, 44) = 32.60, MSE = 0.83, p < 0.001, η2 = 0.426. Subjects made more errors overall in the multi-matrix condition (0.38) than in the single-matrix condition (0.21). Matrix condition also interacted reliably with number of commands, F(5, 220) = 14.29, MSE = 0.18, p < 0.001, η2 = 0.245, because the difference between matrix conditions increased with increases in the number of commands. In addition, the three-way interaction involving display, matrix condition, and number of commands was reliable, F(5, 220) = 3.17, MSE = 0.18, p < 0.009, η2 = 0.067, because the increase in the difference between the two matrix conditions with the increase in the number of commands was greater for the flat display than for the perspective display (see Fig. 5). As discussed in the summary below, this interaction is important because it shows that the effect of matrix condition with a large number of commands is diminished with the addition of the depth cues provided in the perspective condition.

Fig. 5
figure 5

Proportion of errors as a function of matrix condition and number of commands for the manual movement responses in each display of Experiment 1. Error bars represent standard errors of the mean

There was a reliable main effect of wordiness, F(1, 44) = 4.95, MSE = 0.13, p < 0.030, η2 = 0.101, because individuals erred more on the minimal messages (0.31) than they did on the wordier messages (0.28). Number of commands also had a reliable effect: Errors increased steeply as the number of commands increased, F(5, 220) = 218.96, MSE = 0.18, p < 0.001, η2 = 0.833. In addition, there was a reliable interaction involving wordiness and number of commands, F(5, 220) = 3.02, MSE = 0.10, p < 0.012, η2 = 0.064, because the advantage of the wordier messages was greater for the longer messages than for the shorter messages.

Summary

This experiment showed an effect of matrix condition even for the perspective display. Thus, the replicated advantage of the single-matrix condition over the multi-matrix condition (even in the oral repetition responses!) cannot be attributed to the lack of depth cues.

Nevertheless, there was a significant three-way interaction involving display, matrix condition, and number of commands (in the analysis of the manual movement responses), and that interaction is particularly revealing. The pattern in Fig. 5 shows that the addition of the spatial (depth) cues in the perspective display resulted in improved performance (lower error proportions) on the longer messages in the multi-matrix condition compared to performance on messages with the same number of commands in the multi-matrix condition with the flat display, but resulted in worse performance on those long messages in the single-matrix condition with the perspective display compared to performance on messages with the same number of commands in the single-matrix condition with the flat display. This interaction is consistent with a mental representation of the space in which the navigation task occurs, because the depth cues support a three-dimensional mental representation (necessary for the multi-matrix condition), but do not support, and hence may interfere with, a two-dimensional representation (presumably used in the single-matrix condition).

We also replicated the important findings of Barshi and Healy (2002) concerning the effects of number of commands and command wordiness. Specifically, there was a huge effect of number of commands, with performance greatly decreasing as the number of commands increased, but a much smaller effect of wordiness, with a slight advantage for the wordier over the minimal messages with a large number of commands.

Because both display conditions were presented on a flat computer screen as two-dimensional diagrams, the use of diagrams might be crucial to obtaining the effect of matrix condition. Experiment 2 was designed to test this possibility.

Experiment 2

Experiment 2 was a replication of Experiment 1 using physical displays of the space within which movement was to take place instead of displaying the three-dimensional space on a two-dimensional computer screen. If the advantage we found for the single-matrix condition of Experiment 1 was dependent on the use of two-dimensional diagrams, then using a three-dimensional model should eliminate the difference between the two matrix conditions. Hence, Experiment 2 compared three physical displays: a paper printout of the flat display, a paper printout of the perspective display, and a three-dimensional model of the space. To follow the navigation instructions, subjects had to touch the appropriate squares with their finger.

Because the most important finding is the effect of matrix condition on immediate recall of the verbal commands as measured by the oral repetition responses, and because in most of the previous experiments we found no reliable overall differences in performance between the manual movement responses and the oral repetition responses, we analyzed here only the oral repetition responses. Additionally, because in the previous experiments we found no consistent main effect of wordiness, we did not include the wordiness manipulation in this experiment. Furthermore, because we consistently found floor effects for the very short messages of one command and ceiling effects for the very long messages of six commands (floor and ceiling in terms of proportion of errors), in this experiment we used only messages of one, two, three, four, and five commands. As in previous experiments, Experiment 2 included the within-subjects variable of number of commands as well as the between-subjects variable of matrix condition.

Method

Subjects

Seventy-two undergraduate students at the University of Colorado, Boulder, participated for credit in a course in introductory psychology. All subjects were native English speakers.

Design

The design was the same as in Experiment 1, except that there were three displays (flat, perspective, model) instead of just two, there were four numbers of commands (2 to 5) instead of six, and there were only wordier messages instead of both wordier and minimal messages.

Apparatus and materials

The apparatus was a paper printout for the flat and perspective groups and a three-dimensional model for the model group (see Fig. 6 top panel for the model alone and bottom panel for a subject responding on the model).

Fig. 6
figure 6

Model display used (top panel) and subject responding to instructions (bottom panel) in Experiment 2

The materials were the same as used in Experiment 1, except that the stimulus set was shortened from 72 trials that included six different numbers of commands (from 1 to 6 commands) as well as wordier and minimal messages, to 24 trials that included four different numbers of commands (from 2 to 5 commands), one in each of the six blocks of four trials, and wordier messages only. The navigation instructions were identical in all conditions.

Procedure

Subjects heard navigation instructions, repeated the instructions aloud, and then followed the instructions by touching their index finger to the appropriate squares on the physical displays. Subjects' oral repetition responses were audio taped, and their manual movement responses were videotaped.

Analyses

A multifactorial repeated measures ANOVA was conducted for the oral repetition responses. The ANOVA included the between-subjects factors of display (flat, perspective, model) and of matrix condition (single, multi) and the within-subjects factor of number of commands (2 to 5). The analysis involved the strict all-or-nothing scoring procedure.

Results

The results for the oral repetition responses are summarized in Table 3. As in Experiment 1, there was no main effect of display and no reliable interactions involving display. Again, most importantly, there was a reliable main effect of matrix condition, F(1, 44) = 7.76, MSE = 0.49, p < 0.007, η2 = 0.150. Subjects made more errors overall in the multi-matrix condition (0.31) than in the single-matrix condition (0.22). Matrix condition also interacted reliably with number of commands, F(3, 198) = 5.45, MSE = 0.19, p < 0.002, η2 = 0.076, because the advantage of the single-matrix condition over the multi-matrix condition increased as the number of commands increased (see Fig. 7).

Table 3 Proportion of errors on oral repetition responses in Experiment 2 as a function of display, matrix condition, and number of commands
Fig. 7
figure 7

Proportion of errors as a function of matrix condition and number of commands for the oral repetition responses in Experiment 2. Error bars represent standard errors of the mean

There was also a reliable main effect involving number of commands. Errors increased dramatically and monotonically with increases in the number of commands, F(3, 198) = 120.15, MSE = 0.19, p < 0.001, η2 = 0.645.

Summary

Experiment 2 underscored the crucial role played by the mental representation of space in the comprehension of verbal instructions pertaining to this navigation task. The fact that we found an effect of matrix condition even when we used a three-dimensional model of the space and that we did not find an effect of display shows that the use of two-dimensional diagrams is not responsible for the effect of matrix condition. This finding implies that the mental representation is comparable for a three-dimensional physical model and a two-dimensional diagram (cf. Bryant & Tversky, 1999). Furthermore, the significant interaction involving matrix condition and number of commands and the fact that we found no reliable interactions involving display demonstrate that in all display conditions the advantage of the single-matrix condition over the multi-matrix condition is largest for the longer messages.

Experiment 3

In Experiments 1 and 2 we found an effect of matrix condition on immediate memory for verbal navigation instructions even when depth cues were used and when a three-dimensional model was used, rather than a two-dimensional diagram. As mentioned earlier, Bryant and Tversky (1999) suggested that it is not the depictions themselves that determine the mental representation but rather the interpretations of the depictions, and that such interpretations can be influenced by the experimental instructions given. Thus, in Experiment 3 we varied the experimental instructions given to include a condition that did not promote the construction of a three-dimensional mental representation along with the condition using the same experimental instructions as in the earlier experiments, which did promote a three-dimensional mental representation. Specifically, in one set of experimental instructions (model), we showed the subjects a three-dimensional model of the space, and we made it clear to the subjects that the computer display represents that model, as in our previous experiments. In contrast, in the other set of experimental instructions (board), we showed the subjects a two-dimensional piece of paper containing a copy of the computer display, and we told the subjects that the display represents four checkerboards. Thus, the board instructions do not include a three-dimensional representation of the four matrices, and the notion of a checkerboard provides a good cover story for why the subjects would have to move into the same position on a different matrix, without using the three-dimensional model. This experiment included the single-matrix and multi-matrix manipulation between subjects. Also, as in Experiment 1, in this experiment we once again included all six numbers of commands, which in this case involved minimal messages that did not refer to three-dimensional movement (e.g., “left two, up one”), and we examined subjects’ manual movement responses as well as their oral repetition responses. Because the new instructions might lead to different results from those obtained with the original messages, we thought it was important to use the full range of the number of commands and to confirm again that the pattern for manual movement responses largely matches that for oral repetition responses.

If the difference between the two matrix conditions depends on experimental instructions that promote the construction of a three-dimensional mental representation, then we should find this difference only for the model set of instructions, not for the board set because only the model involves three dimensions.

Method

Subjects

Forty-eight undergraduate students from the University of Colorado, Boulder, participated for credit in an introductory psychology class. All students were native English speakers.

Design

The design was the same as in Experiment 1 except that instead of two displays there were two instruction groups (board, model), and the variable of wordiness was no longer manipulated, with all messages of minimal wordiness.

Apparatus, materials, and procedure

The same apparatus, materials, and procedure were used as in Experiment 1 except that we used only the minimal messages. Specifically, there were 36 experimental trials divided into six 6-trial blocks, and each block of six trials included one trial of each number of commands. The same messages were used in each of the two instruction groups, and the same manual movements were required in each case.

The two groups of subjects were given different sets of experimental instructions. In the model instructions, as in Experiments 1 and 2 and in the experiments by Barshi and Healy (2002), subjects were shown a three-dimensional model of the space, and were told that the computer display represented that model. In contrast, in the board instructions, subjects were shown a two-dimensional piece of paper illustrating the computer display, and they were told that the display represented four checkerboards.

Analyses

A mixed factorial ANOVA was conducted for the oral repetition responses as well as for the manual movement responses. Each ANOVA included the between-subjects factors of instruction group (board, model) and of matrix condition (single, multi) and the within-subject factor of number of commands (1 to 6). Each analysis involved the strict scoring procedure.

Results

Oral repetition responses

The results for the oral repetition responses are summarized in Table 4. Importantly, as found in Experiments 1 and 2, the error proportion was lower overall in the single-matrix condition (0.29) than in the multi-matrix condition (0.39); the main effect of matrix condition was significant, F(1, 44) = 13.13, MSE = 0.34, p < 0.002, η2 = 0.230. The error proportion was significantly lower overall for the board instruction group (0.30) than for the model instruction group (0.37); the main effect of instruction group was significant, F(1, 44) = 5.71, MSE = 0.34, p < 0.021, η2 = 0.115. Importantly, effects of matrix condition were found for both the board instruction group (single-matrix: 0.26, multi-matrix: 0.35) and the model instruction group (single-matrix: 0.32, multi-matrix: 0.43); the interaction of matrix condition and instruction group was not significant, F < 1.

Table 4 Proportion of errors on oral repetition responses in Experiment 3 as a function of instruction group, matrix condition, and number of commands

Also, the error proportion increased monotonically as the number of commands increased, F(5, 220) = 273.48, MSE = 0.12, p < 0.001, η2 = 0.861. The advantage for the single-matrix condition varied as a function of the number of commands (see Fig. 8 top panel), largely because of the floor in error proportion at the two shortest numbers of commands; the interaction of matrix condition and number of commands was significant, F(5, 220) = 6.08, MSE = 0.12, p < 0.001, η2 = 0.121.

Fig. 8
figure 8

Proportion of errors for oral repetition responses (top panel) and manual movement responses (bottom panel) as a function of number of commands and matrix condition in Experiment 3. Error bars represent standard errors of the mean

Manual movement responses

The results for the manual movement responses are summarized in Table 5. Again, most important is the fact that the error proportion was lower overall in the single-matrix condition (0.31) than in the multi-matrix condition (0.44), F(1, 44) = 19.71, MSE = 0.41, p < 0.001, η2 = 0.309. Although there was no significant main effect of instruction group, there was a significant interaction of instruction group and number of commands, F(5, 220) = 2.73, MSE = 0.14, p < 0.021, η2 = 0.058, reflecting the fact that there was a lower error rate for the board instruction group than for the model instruction group but only for messages with three, four, and six commands (see Fig. 9). Importantly, the two instruction groups showed equivalent effects of matrix condition; the interaction of matrix condition and instruction group was not significant, F < 1.

Table 5 Proportion of errors on manual movement responses in Experiment 3 as a function of instruction group, matrix condition, and number of commands
Fig. 9
figure 9

Proportion of errors for manual movement responses as a function of number of commands and instruction group in Experiment 3. Error bars represent standard errors of the mean

Also, as in previous experiments, the error proportion increased monotonically as the number of commands increased, F(5, 220) = 245.11, MSE = 0.14, p < 0.001, η2 = 0.848. Largely because of the floor in error proportion at the shortest number of commands, the advantage for the single-matrix condition varied as a function of the number of commands (see Fig. 8 bottom panel); the interaction of matrix condition and number of commands was significant, F(5, 220) = 6.67, MSE = 0.14, p < 0.001, η2 = 0.132.

Summary

The effect of matrix condition was found in this experiment even with experimental instructions that did not promote a three-dimensional mental representation. However, the significant main effect of instruction group on the oral repetition responses shows that there is a cost involved in constructing a three-dimensional representation.

General discussion

Barshi and Healy (2002) found that immediate recall of verbal navigation instructions depended on the way the subjects interpreted the commands, leading the authors to the critical conclusion that the verbal representation of navigational instructions depends on the mental representation of the navigational space. Specifically, subjects were led to view the navigation instructions either as involving all three dimensions depicted in the space by moving across matrices or as involving only two of the three dimensions by moving only within a single matrix, and subjects were more accurate in their recall responses in the single-matrix condition than in the multi-matrix condition. In our present study, we extended and elaborated these results beyond this earlier finding by examining three aspects of the experimental procedure that might be responsible for the effect of matrix condition because of their influence on the construction of the subjects’ mental representation of the space: (a) the lack of depth cues, (b) the use of diagrams, and (c) the role of experimental instructions. We found an effect of matrix condition on the oral repetition responses even when we added depth cues to the display, even when the display consisted of a three-dimensional model instead of a two-dimensional diagram, and even when the experimental instructions did not promote a three-dimensional representation, thus enhancing the generality or external validity of this finding and implying that the critical conclusion is not limited to a specific experimental situation. Although the effect of matrix condition occurred in all experimental conditions, those conditions did influence performance in some important ways, thereby providing additional insight into the nature of the underlying mental representations of the space.

The significant three-way interaction of display, matrix condition, and number of commands on manual movement responses in Experiment 1 provides evidence that subjects relied on a three-dimensional mental representation of the space in the multi-matrix condition but not in the single-matrix condition because on the longer messages the depth cues aided performance in the multi-matrix condition but impaired performance in the single-matrix condition. Furthermore, the significant main effect of instruction group on oral repetition responses (along with the significant interaction of instruction group and number of commands in the manual movement responses) in Experiment 3 demonstrates that the three-dimensional mental representation is difficult to construct because performance on the long messages was worse with experimental instructions that promoted a three-dimensional representation than with experimental instructions that did not promote such a representation.

This study, thus, shows that the lack of depth cues, the use of two-dimensional diagrams, and the experimental instructions are not responsible for the effect of matrix condition. That effect shows a substantial performance advantage in memory for verbal navigation instructions that pertain to movements within a single matrix when compared with movements across matrices. What factors then are responsible for that effect?

One way to view the effect of matrix condition, which was suggested in Barshi and Healy (2002), concerns the fact that in the multi-matrix condition, but not in the single-matrix condition, subjects must move outside the picture plane. An alternative way to view the effect of matrix condition is, more simply, in terms of the adjacency of the required movements. In the single-matrix condition, all movements are to adjacent cells, whereas in the multi-matrix condition up and down movements are to cells that are not adjacent in the depicted display. Although it might not be surprising that manual movements are more accurate when only adjacent cells are involved, it is noteworthy that the immediate recall of the verbal instructions is affected by the adjacency of the required movements. Regardless, by both of these alternatives, the multi-matrix condition requires a more complex mental representation than does the single-matrix condition.

By array theories of imagery (e.g., Kosslyn, 1980), individuals directly perceive only two dimensions and, thus, to visualize three dimensions the third must be derived from the other two. By such theories, memory for instructions should be worse whenever they refer to a three-dimensional, rather than a two-dimensional, space. Support for this hypothesis is provided by the significant effect of matrix condition. These theories stand in contrast to sandbox theories of imagery (e.g., Attneave, 1972), according to which there should be no difficulty visualizing three dimensions as they are all perceived directly. Thus, our results do not support the sandbox view.

Array theories hold that depth is a critical dimension and that imagery processing would be slowed whenever depth is involved. However, the depth dimension was not involved in the movements required in the multi-matrix condition because movements were made only along the dimensions of width and height. The depth dimension was involved in the single-matrix condition, which showed a performance advantage over the multi-matrix condition. Thus, it is possible to interpret the present results as inconsistent with array theories but consistent with Gibson's (e.g., 1979) approach, which would predict greater difficulty when all three dimensions are included, rather than attributing a special status to any single dimension.

Further evidence that the number of dimensions included in the spatial representation affects performance, and thus implying that a three-dimensional representation is more complex than a two-dimensional representation, is provided by finding a disadvantage in Experiment 3 for the model instructions relative to the board instructions for the manual movement responses and more clearly for the oral repetition responses. A similar conclusion was drawn in a study by Westerman, Collins, and Cribbin (2005), who compared two- and three-dimensional computer displays given to subjects who were browsing for information in the displays. They found navigation through a two-dimensional space to be less effortful than that through a three-dimensional space. In addition, interestingly, they found that subjects used different search strategies under the two conditions, implying qualitative as well as quantitative differences contributing to complexity.

The results of our present experiments showed that the two-dimensional advantage found in Barshi and Healy (2002), even for oral repetition, was indeed an aspect of the mental representation. This finding challenges the suggestion by Lyon, Gunzelmann, and Gluck (2008) that subjects might repeat back instructions in this task without constructing a verbal representation that would be influenced by the spatial aspects of the task. It is clear from the results of our present study that the verbal representation of the navigation instructions was in fact influenced by the subjects’ spatial representation of their required movements.

Furthermore, the comparison of the physical three-dimensional model and the two-dimensional paper displays in Experiment 2 provides evidence against the claim (Bryant & Tversky, 1999; Ittelson, 1996) that representations arising when viewing a diagram are different from those arising when viewing a model of the same scene. On the basis of finding an effect of matrix condition for all of the different displays in our study, we suggest that the mental representation formed in our task is comparable for a three-dimensional physical model and a two-dimensional diagram.

The findings of our present study, together with Kerr's (1993) findings and our earlier work (Barshi & Healy, 2002), allow us to propose an approach that combines some of the notions of array theories with some of Gibson's (1979) notions. Specifically, we provide further evidence to support our proposal that what makes one spatial representation more complex and harder to maintain and manipulate than another spatial representation is the existence of a dimension that is outside the picture plane (regardless of which dimension it is). However, a comprehensive account must also address the overall number of spatial dimensions involved and a complexity limit that is independent of the number of spatial dimensions employed. The complexity limit would accommodate Kerr's (1993) results showing that a large two-dimensional matrix was harder to handle than a smaller three-dimensional matrix. The number of spatial dimensions employed would be needed to accommodate the performance disadvantage we found in the present Experiment 3 for the model instructions (which are based on a three-dimensional interpretation of the diagram) relative to the board instructions (which are based on a two-dimensional interpretation of the same diagram).

Subjects in this paradigm move through a small space with their fingers or a computer mouse. Nevertheless, the results have important practical implications for remembering navigation instructions in any size space. One of the most striking results of this entire line of research involves the large effects of number of commands, and the contrasting negligible effects of command wordiness. Specifically, performance declined steeply with increases in the number of commands, but there was little or no decline in performance with increases in wordiness (i.e., the number of words per command). For example, increasing the number of words from four to eight by increasing the number of commands from two minimal commands to four minimal commands reduced performance dramatically. However, the same increase in the number of words by going from two minimal messages to two wordier messages in fact significantly improved manual movement performance (in Experiment 1). These findings, which are consistent with those of other researchers investigating aviation communication (e.g., Prinzo et al., 2006; Tiewtrakul & Fletcher, 2010) have important implications for both theory and practice. In terms of theory, they suggest that propositional chunks rather than words comprise the contents of working memory and that working memory capacity in naturalistic settings is limited to about three chunks (e.g., Cowan, 2001). In terms of practical applications, they suggest that, to increase the likelihood of error-free performance, those giving navigation instructions should limit their messages to no more than three commands, but the number of words in each command is much less important.