Pattern formation entails the size-dependent, hierarchical organization of perceptual units. Relatively large global patterns are formed from smaller parts, each of which is formed from still smaller parts, and so on down to the smallest units that are represented in the visual system. Accordingly, Biederman (1987) argued that global patterns (i.e., objects) are formed from intermediate-level, three-dimensional components called geons, which, in turn, are formed from locally detected edges and surfaces. Hochstein and Ahissar (2002) characterized hierarchical processing in terms of receptive field size and, thus, different levels of neural processing; that is, global features are detected by larger receptive fields than are local features and, therefore, are processed further upstream in the brain. They argued that conscious perception begins at the global level, then “works downward” through feedback to local levels, where smaller receptive fields are responsive to relatively detailed stimulus information. Global-to-local feedback has likewise been implicated in visual attention (e.g., Ito & Gilbert, 1999; Motter, 1993; Spratling & Johnson, 2004), object recognition (e.g., Deco & Rolls, 2005), expectancy (Von Stein, Chiang, & König, 2000), and other perceptual phenomena (for a review, see Gilbert & Sigman, 2007).

The existence of neural feedback from higher to lower level brain areas is well established (Barbas & Rempel-Clower, 1997; Felleman & Van Essen, 1991; Maunsell & Van Essen, 1983). With regard to motion perception, there is both neurophysiological evidence and evidence from transcranial magnetic stimulation (TMS) for neural feedback from upstream, directionally selective motion detectors with relatively large receptive fields to lower level motion detectors with smaller receptive fields (Hupé et al., 1998; Muckli, Kohler, Kriegeskorte, & Singer, 2005; Pascual-Leone, Walsh, & Rothwell, 2000; Silvanto, Cowey, Lavie, & Walsh, 2005; Wibral, Bledowski, Kohler, Singer, & Muckli, 2009). There also is evidence for top-down effects of attention on motion processing (Treue & Martinez-Trujillo, 1999; Treue & Maunsell, 1996; Womelsdorf, Anton-Erxleben, Pieper, & Treue, 2006). However, there have been neither neurophysiological nor psychophysical reports regarding the influence of global-to-local feedback on hierarchical pattern formation for stimuli with multiple motion components. Such feedback would potentially establish a recurrent excitatory loop (Deco & Rolls, 2005; Gilbert & Sigman, 2007; Mumford, 1992) that stabilizes the perceived global motion pattern by virtue of global-level detectors interactively increasing the activation of local motion detectors whose directional selectivity is consistent with the global pattern. Stimulus-initiated activation that is consistent with a particular global pattern would feed forward from local detectors to a more global level of processing, and global motion pattern detectors, when sufficiently activated, would provide feedback that either increases or decreases the activation of the local detectors, depending on their consistency with the global pattern. This, in turn, would increase the feedforward activation from the pattern-consistent local detectors to the global pattern detectors, closing the feedforward/feedback loop.

It is neurally plausible for this kind of recurrent feedforward/feedback loop to affect global pattern formation, because the presence of myelination results in rapid transmission of activation between different processing levels (on the order of several milliseconds; Girard, Hupé, & Bullier, 2001; Hupé et al., 2001; Movshon & Newsome, 1996), whereas the absence of myelination results in relatively slow, distance-dependent transmission of activation between detectors at the same level of processing (on the order of 15 ms per degree of visual angle for horizontal connections between detectors in area V1, as derived from Chavane et al., 2000; but see also Bringuier, Chavane, Glaeser, & Fregnac, 1999; Grinvald, Lieke, Frostig, & Hildesheim, 1994).

Because it entails the temporal coordination of activation changes occurring both globally and locally, the formation of hierarchical motion patterns is intrinsically dynamical. The purpose of this article is to examine these temporal dynamics.

A fundamental principle of neurally based dynamical models is that perception is embodied in the distribution of activation over ensembles of stimulated detectors. What we see is determined by the attribute selectivity of those detectors that are activated to levels that exceed the read-out threshold required for their perception (Hock, Schöner, & Giese, 2003). Thus, a hierarchical pattern is perceived when a global pattern detector and the local detectors whose selectivities are consistent with the global pattern are simultaneously activated to above-threshold levels (i.e., both the “forest and the trees” are perceived). Although, in some circumstances, a hierarchical pattern might be formed solely by activation feeding forward from the local to the global levels, it will be shown that hierarchical pattern formation for the stimuli studied in this article requires global-to-local feedback in order for the activation of both global detectors and globally consistent local detectors to be stabilized at suprathreshold levels.

The motion quartet

The exemplary stimulus for Hock et al.’s (2003) computational analysis of motion pattern formation was the motion quartet, a much-studied apparent motion stimulus for which directionally selective vertical and directionally selective horizontal motion detectors are simultaneously stimulated but vertical and horizontal motion never are perceived at the same time. Either parallel-path vertical or parallel-path horizontal motion is perceived (Chaudhuri & Glaser, 1991; Hock, Kelso, & Schöner, 1993; Hoeth, 1968; Kruse, Stadler, & Wehner, 1986; Ramachandran & Anstis, 1985; von Schiller, 1933). As is illustrated in Fig. 1, large aspect ratios for the motion quartet favor the perception of parallel-path horizontal motion, whereas small aspect ratios favor the perception of parallel-path vertical motion. (The aspect ratio is the vertical divided by the horizontal distance between the elements.) Hock et al.’s (2003) model showed that parallel-path motion for the motion quartet can result from activation-dependent reciprocal inhibition between local motion detectors responding selectively to horizontal versus vertical motion, with motion over shorter paths being more strongly activated and, therefore, more strongly inhibiting motion over longer paths than vice versa (e.g., vertical motion is more likely than horizontal motion for aspect ratios < 1.0). Global pattern detectors were not required, at least for this level of pattern formation.
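The competition described above can be illustrated with a minimal sketch. It is not Hock et al.'s (2003) model; it is a toy two-unit system written under stated assumptions: stimulus input falls off as the reciprocal of path length, inhibition is passed through a sigmoid, and all parameter values (`rest`, `gain`, `w_inhib`, `tau`) are hypothetical choices made only so that the qualitative behavior is visible.

```python
import math

def sigmoid(u, beta=4.0):
    """Soft threshold: near 0 for negative activation, near 1 for positive."""
    return 1.0 / (1.0 + math.exp(-beta * u))

def simulate_quartet(aspect_ratio, steps=500, dt=0.01, tau=0.1,
                     rest=-1.0, gain=3.0, w_inhib=3.0):
    """Euler-integrate two mutually inhibitory motion detectors.

    aspect_ratio = vertical / horizontal interelement distance.
    Assumption: input strength is gain / path length, so the shorter
    path is the more strongly stimulated one.
    """
    horizontal_dist = 1.0
    vertical_dist = aspect_ratio * horizontal_dist
    input_h = gain / horizontal_dist
    input_v = gain / vertical_dist
    u_h = u_v = rest  # both detectors start at resting level
    for _ in range(steps):
        # Activation-dependent inhibition: each detector is suppressed
        # in proportion to the thresholded activation of its competitor.
        du_h = (-u_h + rest + input_h - w_inhib * sigmoid(u_v)) / tau
        du_v = (-u_v + rest + input_v - w_inhib * sigmoid(u_h)) / tau
        u_h += dt * du_h
        u_v += dt * du_v
    return u_h, u_v

def perceived_direction(aspect_ratio, readout_threshold=0.0):
    """Report which detector ends above the (assumed) read-out threshold."""
    u_h, u_v = simulate_quartet(aspect_ratio)
    if u_h > readout_threshold and u_v <= readout_threshold:
        return "horizontal"
    if u_v > readout_threshold and u_h <= readout_threshold:
        return "vertical"
    return "undetermined"

print(perceived_direction(0.5))  # small aspect ratio
print(perceived_direction(2.0))  # large aspect ratio
```

In this sketch only one of the two detectors ends above threshold for clearly unequal path lengths, which mirrors the observation that vertical and horizontal motion are not perceived together. With an aspect ratio of exactly 1.0 the two units receive identical input and the deterministic toy system cannot break the tie; a noise term would be needed to reproduce bistable switching.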

Fig. 1

Illustration of motion quartet stimuli. The open squares indicate the element locations during odd-numbered frames, and the filled squares indicate the element locations during even-numbered frames. a Perception is bistable for aspect ratios near 1.0; either parallel-path horizontal or parallel-path vertical motion is perceived. b Large aspect ratios favor the perception of horizontal parallel-path motion. c Small aspect ratios favor the perception of vertical parallel-path motion

In a subsequent computational study, Nichols, Hock, and Schöner (2006) found that distinguishing between different levels of processing is necessary in order to account for the effects of luminance perturbations on the stability of the parallel-path motion pattern perceived for the motion quartet. Hock and Ploeger (2006) had shown that depending on whether the luminance perturbations increased or decreased the activation of local motion detectors, the stability of the parallel-path motion perceived for the motion quartet either increased or decreased (see Footnote 1). At the lower level of Nichols et al.’s model, competition between local motion detectors and motion-independent position detectors determines whether or not the stimulus-initiated activation of local motion detectors feeds forward to higher level dynamics. At the higher level, inhibitory competition between detectors that respond selectively to horizontal motion and detectors that respond selectively to vertical motion determines whether parallel-path horizontal or parallel-path vertical motion is perceived.

Nichols et al.’s (2006) two-level model is entirely feedforward. It is consistent with evidence that area V1 motion detectors with the same directional selectivity converge onto larger area MT receptive fields with that directional selectivity (Kohn & Movshon, 2003; Movshon & Newsome, 1996). Although there is no systematic pattern of detector interaction within area V1 (Snowden, Treue, Erickson, & Andersen, 1991), there are consistent inhibitory interactions between detectors responsive to different motion directions in area MT (Heeger, Boynton, Demb, Seidemann, & Newsome, 1999; Recanzone, Wurtz, & Schwarz, 1997; Snowden et al., 1991; Thiele, Dobkins, & Albright, 2000). As was discussed above, Hock et al.’s (2003) model for the motion quartet showed that these cross-direction inhibitory interactions can promote the perception of motions occurring over shorter interelement distances by suppressing motions occurring over longer interelement distances. This is consistent with evidence that (1) local motion detectors are more weakly activated when the motion covers a greater distance (Burt & Sperling, 1981; Gilroy, Hock, & Ploeger, 2001) and (2) inhibitory interactions are activation dependent: the greater the activation of a directionally selective motion detector, the more strongly it inhibits detectors that differ in directional selectivity (Hock & Ploeger, 2006; Levinson & Sekuler, 1975).

Global motion patterns

Global parallel-path motion

When multiple motion quartets are presented in an arbitrary spatial configuration, the perception of global parallel-path motion predominates; either parallel-path vertical or parallel-path horizontal motion is perceived simultaneously for all the motion quartets. If the motion axis for parallel-path motion switches for one of the quartets, it switches for all of them—for example, from all horizontal to all vertical or vice versa (Ramachandran & Anstis, 1985). The multiquartet motion pattern therefore repeats the same parallel-path motion as that perceived for individual quartets and can be accounted for by longer range versions of the short-range, cross-direction inhibitory interactions that are thought to be the basis for the perception of parallel-path motion within individual quartets. For example, if the aspect ratio for one quartet favors horizontal motion, short-range cross-direction inhibition would suppress the less strongly activated vertical motion detectors for that quartet, and long-range cross-direction inhibition would reduce the likelihood of vertical motion perception for the other quartets. Therefore, the activation of higher level, global detectors that respond selectively to parallel-path motion would not be necessary in order to account for the formation of global parallel-path patterns. On this basis, their perception would not be considered hierarchical.

Global rotational motion

An exception to the predominance of global parallel-path motion for multiquartet stimuli is obtained when four quartets are arranged in a diamond configuration. Although global parallel-path motion can be perceived (Fig. 2a), global rotational rocking motion (hierarchical rotation) also can be perceived (Fig. 2b). For the latter, alternating clockwise (CW) and counterclockwise (CCW) global rotation is perceived during successive frames, accompanied by the perception of rotation-consistent local motions for the outermost elements of each quartet. (Motion is not perceived for the inner elements of the diamond configuration, perhaps because it is masked for the inner elements.) The significance of this hierarchical rotational-rocking motion pattern is that it cannot be accounted for by the long-range, cross-direction inhibitory interactions among the quartets that are sufficient for the perception of global parallel-path motion, because vertical motion is perceived for the left and right quartets and, at the same time, horizontal motion is perceived for the top and bottom quartets. (The perception of hierarchical rotation can be seen in demo movie 1. The quartets are further apart in demo movie 2, so global parallel-path horizontal or global parallel-path vertical motion can be perceived.)

Fig. 2

Global motion patterns formed from multiple motion quartets. The open squares indicate the element locations during odd-numbered frames, and the filled squares indicate the element locations during even-numbered frames. a Global parallel-path motion. b Global rotational motion, which is perceived for the outer, but not the inner, elements

The perception of hierarchical rotational rocking implies the following. (1) Motion detectors are activated that have large enough receptive fields to encompass all four motion quartets and respond selectively to either CW or CCW rotation. Such detectors are found in area MSTd (Duffy & Wurtz, 1995; Orban et al., 1992; Tanaka & Saito, 1989). (2) There is excitatory feedback from the global rotation detectors that increases the activation of local motion detectors whose selectivity is consistent with the global rotation and, perhaps as well, inhibitory feedback that decreases the activation of local motion detectors whose selectivity is inconsistent with the global rotation. Activation above the read-out threshold for global rotation detectors and rotation-consistent, directionally selective local detectors would result in the perception of a hierarchical motion pattern. An important feature of this pattern is that it is sufficiently stable at both levels to persist when the aspect ratio of the quartets is changed and when attention is shifted from the global to the local levels. The research that follows indicates the basis for this stability.

Theoretical framework

Although the computational model presented later in this article is specific to the diamond configuration of motion quartets, our goal is to use the diamond quartet stimulus as a model system for investigating the temporal dynamics of hierarchical motion pattern formation. A simplified sketch of the model is presented in Fig. 3.

Fig. 3

Sketch of dynamical model incorporating 32 directionally selective local motion detectors (8 for each quartet) and 2 global rotation detectors (clockwise and counterclockwise). Local detectors that respond selectively to horizontal and vertical motions are mutually inhibitory. Their activation feeds forward to the global detectors if their directional selectivity is rotation consistent. Global-to-local feedback is excitatory for local motion detectors whose directional selectivity is rotation consistent and is inhibitory for local motion detectors whose directional selectivity is rotation inconsistent.

The local level in the model is composed of competing, directionally selective detectors for horizontal (left and right) and vertical (up and down) motion. Consistent with the absence of myelination for horizontal connections between detectors at the same level of processing (Bringuier et al., 1999; Chavane et al., 2000; Grinvald et al., 1994), interactions between spatially separate detecting units depend on the distance between them. Thus, short-range, cross-direction inhibition affects the activation of competing motion directions within each quartet, and long-range inhibitory interactions with the other quartets, although weaker and delayed because of the greater distance, promote the perception of global parallel-path motion. The global level in the model is composed of detectors with larger receptive fields that respond selectively to either CW or CCW rotation. Each receives input from local motion detectors whose directional selectivity is rotation consistent, unless the activation of these local detectors is suppressed by cross-direction inhibition.

Finally, the aspect of the model that is central to the present study is the presence of neural feedback from the global rotation detectors (presumably in area MSTd) to local, directionally selective motion detectors (presumably in area MT). The global-to-local feedback in the model is excitatory for rotation-consistent local motion directions and inhibitory for rotation-inconsistent motion directions. Whether hierarchical global rotation or global parallel-path motion is perceived depends on whether the activation of the rotation-consistent local motion detectors is sufficiently boosted by the excitatory feedforward/feedback loop to overcome the effects of cross-direction inhibition both within and between the motion quartets. Global-to-local feedback would thereby create a detection instability (Bicho, Mallet, & Schöner, 2000; Hock & Schöner, 2011; Johnson, Spencer, & Schöner, 2009; Schneegans & Schöner, 2008; Schöner, 2008), which implies a self-excitation threshold in addition to the read-out threshold that determines whether or not activation is sufficient for motion to be perceived (Hock & Schöner, 2010). Either activation remains below the self-excitation threshold (and thereby, the read-out threshold as well), or it is sufficient to pass through the self-excitation threshold and initiate an activation-stabilizing feedforward/feedback loop.
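The idea of a self-excitation threshold can be made concrete with a small sketch. It is not the computational model presented later in the article; it reduces the system to one rotation-consistent local unit and one global rotation unit, treats cross-direction inhibition as part of a constant negative resting level, and uses hypothetical weights (`w_feedforward`, `w_feedback`) and a hypothetical read-out threshold of zero.

```python
import math

def sigmoid(u, beta=4.0):
    """Soft threshold applied to activation before it is transmitted."""
    return 1.0 / (1.0 + math.exp(-beta * u))

def run_loop(schedule, dt=0.005, tau=0.05,
             rest_local=-2.0, rest_global=-2.0,
             w_feedforward=4.0, w_feedback=3.0):
    """Euler-integrate a local unit and a global unit coupled in a loop.

    schedule is a list of (stimulus_strength, duration_in_seconds) phases.
    The local unit receives the stimulus plus excitatory feedback from the
    global unit; the global unit receives only feedforward input.
    All parameter values are illustrative assumptions.
    """
    u_local, u_global = rest_local, rest_global
    for stimulus, duration in schedule:
        for _ in range(int(round(duration / dt))):
            du_local = (-u_local + rest_local + stimulus
                        + w_feedback * sigmoid(u_global)) / tau
            du_global = (-u_global + rest_global
                         + w_feedforward * sigmoid(u_local)) / tau
            u_local += dt * du_local
            u_global += dt * du_global
    return u_local, u_global

weak = run_loop([(1.0, 1.0)])                 # input never engages the loop
strong = run_loop([(2.5, 1.0)])               # input passes the self-excitation threshold
persist = run_loop([(2.5, 1.0), (1.0, 1.0)])  # strong input, then the weak input
print(weak, strong, persist)
```

Under these assumptions the weak input alone leaves both units below zero, the strong input drives both above zero, and the same weak input keeps both units above zero once the loop has been engaged. That history dependence is the sense in which the feedforward/feedback loop stabilizes activation.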

Experiment 1

The purpose of Experiment 1 was to provide psychophysical evidence that would confirm the phenomenological impression of rotational rocking for the diamond quartet configuration. If it is, indeed, global rotation that is perceived, the likelihood of its perception will depend on the size of the rotation angle created by the displacement of the elements composing each quartet. This was tested by co-varying the size of each quartet and its distance from the putative center of rotation (i.e., the global radius of the diamond configuration). For the same quartet size, the rotation angle was larger when the global radius was smaller (Fig. 4a), and for the same global radius, the rotation angle was greater when the quartet size was larger (Fig. 4b).

Fig. 4

A not-to-scale illustration of the rotation angles intercepted by the outer elements of the motion quartet on the left side of the diamond configuration. The center of the diamond configuration is indicated by a dot. a For the same size quartet, the rotation angle increases with decreased global radius (angle A > angle B). b For the same global radius, the rotation angle increases with increased quartet size (angle C > angle A)
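The relation among quartet size, global radius, and rotation angle can be sketched numerically. The formula below is an assumption inferred from the stimulus description rather than one stated in the text: the two outer elements of a quartet are taken to lie at the global radius from the center, separated by the quartet size, so the angle they subtend at the center is 2·atan((size/2)/radius). The function name is hypothetical.

```python
import math

def rotation_angle_deg(quartet_size_deg, global_radius_deg):
    """Angle (in degrees) subtended at the display center by the two outer
    elements of one quartet, under the assumed geometry described above."""
    half_displacement = quartet_size_deg / 2.0
    return math.degrees(2.0 * math.atan(half_displacement / global_radius_deg))

# Same quartet size, smaller global radius: larger angle (Fig. 4a).
print(rotation_angle_deg(0.34, 0.99), rotation_angle_deg(0.34, 1.79))
# Same global radius, larger quartet: larger angle (Fig. 4b).
print(rotation_angle_deg(0.45, 1.16), rotation_angle_deg(0.34, 1.16))
```

With a quartet size of 0.45° and a global radius of 1.16°, this sketch gives a rotation angle of roughly 22°.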

Method

Stimuli

In this and the remaining experiments, the stimuli were centered on the screen of an NEC MultiSync FP 955 monitor and were viewed from a distance of 87 cm, which was maintained with a head restraint. The small white elements composing each quartet (3.4 × 3.4 arcmin; luminance = 81.4 cd/m²) were presented against a dark background (luminance < 0.001 cd/m²).

There were four simultaneously presented motion quartets, each centered on a corner of an imaginary diamond (Fig. 2). They were presented during trials composed of 6 two-frame display cycles (250 ms per frame, except for the first frame of each trial, for which the duration was 1.0 s). Within each quartet, two elements were simultaneously presented in the diagonally opposed corners of an imaginary rectangle during the first frame of each display cycle, followed by their presentation in the other diagonally opposed corners during the second frame of each display cycle, and so on back and forth. In order to create the possibility of rotational motion, the frame-to-frame displacements of the elements within the left and right quartets were 180° out of phase with the frame-to-frame displacements for the top and bottom quartets. The aspect ratio of each quartet always was 1.0; that is, the horizontal and vertical distances between the elements composing each quartet were equal. The center-to-center distance between the elements defined the size of the quartets; it was 0.11°, 0.23°, 0.34°, or 0.45°. Because pilot work indicated that the direction of the perceived rotation was determined by the element displacements furthest from the center of the display, the global radius of the diamond configuration was defined as the distance from its center to the midpoint of the outermost elements.

Pilot work, which used the same method of constant stimuli as in Experiment 1, altered the range of global radii until values were found for each participant such that rotational rocking always was perceived at one end of the range and never was perceived at the opposite end of the range. For participants S.B. and E.D., the global radius was varied from 0.31° to 0.51° in steps of 0.03° for the quartet size of 0.11°, from 0.71° to 1.1° in steps of 0.06° for the quartet size of 0.23°, from 0.99° to 1.79° in steps of 0.12° for the quartet size of 0.34°, and from 1.16° to 1.96° in steps of 0.12° for the quartet size of 0.45°. For participant M.A., the global radius was varied from 0.34° to 2.33° in steps of 0.29° for the quartet size of 0.11°, from 0.94° to 3.12° in steps of 0.31° for the quartet size of 0.23°, from 1.16° to 4.14° in steps of 0.43° for the quartet size of 0.34°, and from 1.5° to 4.48° in steps of 0.43° for the quartet size of 0.45°.

Design

There were four blocks of trials per testing session, each for a different quartet size; their order was Latin-square counterbalanced over four testing sessions. Each block was composed of 80 trials; eight global radius values were repeated 10 times (their order was randomized within subblocks of 8 trials).
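A design of this kind could be generated along the following lines. This is only a sketch: the text does not say which Latin square was used or how the randomization was implemented, so the cyclic square, the integer radius indices, and the function names are all assumptions made for illustration.

```python
import random

QUARTET_SIZES_DEG = [0.11, 0.23, 0.34, 0.45]
N_RADII = 8        # eight global radius values per quartet size
REPETITIONS = 10   # each radius value repeated 10 times per block

def cyclic_latin_square(items):
    """One valid Latin square: row r is the item list rotated by r places."""
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]

def radius_order_for_block(rng):
    """Order of radius indices for one 80-trial block, shuffled
    independently within each subblock of eight trials."""
    order = []
    for _ in range(REPETITIONS):
        subblock = list(range(N_RADII))
        rng.shuffle(subblock)
        order.extend(subblock)
    return order

# One row per testing session: the order of the four quartet-size blocks.
session_orders = cyclic_latin_square(QUARTET_SIZES_DEG)
block = radius_order_for_block(random.Random(1))

for session, sizes in enumerate(session_orders, start=1):
    print(session, sizes)
```

Each quartet size appears once in every session and once in every ordinal position across the four sessions, which is the property that Latin-square counterbalancing is meant to guarantee.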

Procedure

Participants were instructed to spread their attention over the entire display while fixating in its center. Global rotation was not perceived without this attentional spread (Balz & Hock, 1997; Castiello & Umiltà, 1990; Hock, Balz, & Smollon, 1998; LaBerge & Brown, 1986). The first frame of each trial was presented for 1 s to allow sufficient time for the preparation of spread attention. After each trial, participants indicated whether or not they perceived rotational rocking by pressing designated keys on the computer keyboard.

Participants

Three Florida Atlantic University students with normal or corrected-to-normal vision voluntarily participated in the experiment. Two, M.A. and E.D., were naïve with respect to the purpose of the experiment. The third, S.B., was an author.

Results

The same pattern of results was obtained for all 3 participants. For each quartet size, global rotational rocking was perceived more often for smaller values of the global radius (Fig. 5a), consistent with the strength of the rotational-rocking percept increasing with increases in the rotation angle for the motion of the outer elements of each quartet. Also consistent with the strength of rotational rocking increasing with increases in rotation angle was evidence that it was perceived more often for larger quartets. This is illustrated by the broken vertical lines in Fig. 5a. For example, when the global radius was 1.1°, S.B. always perceived rotational rocking when the quartet size was 0.34° but perceived it for only 11% of the trials when the quartet size was 0.23°.

Fig. 5

Results for each participant in Experiment 1. Graphed are the proportions of trials for which global rotational rocking was perceived a as a function of quartet size and global radius (note that the range of global radii is greater for participant M.A. than for the other two participants) and b as a function of rotation angle, which is calculated from the quartet size and global radius. The broken vertical lines in panel a indicate how the perception of rotational rocking increases with quartet size when the global radius is held constant. The broken vertical line in panel b indicates that the perception of rotational rocking is maximized for rotation angles approaching 22.5°, which maximizes the rotation angle while minimizing ambiguity between clockwise and counterclockwise rotation directions. There was no indication of any residual trends associated with the different sizes of the motion quartets

The dependence of global rotational rocking on the angle of rotation is further demonstrated in Fig. 5b, where the data points are determined by the rotation angle calculated from the size of the quartet and the global radius for each stimulus. Rotation angle accounted for approximately 80% of the variance in the proportion of trials in which rotational rocking was perceived, with no indication of residual trends associated with the different sizes of the motion quartets.

It also can be seen in Fig. 5b that the perception of rotational rocking reached ceiling near a rotation angle of 22.5° (indicated by a broken vertical line). The 22.5° displacement is one quarter of the 90° rotational angle separating the outer elements of each quartet from the quartets closest to it. For example, the left-hand quartet can be rotated 90° into alignment with the quartet at the top of the diamond configuration. Rotation angles approaching 45° therefore would be directionally ambiguous, so CW and CCW rotation would be equally likely. Hence, 22.5° is optimal for the perception of rotation; it maximizes the rotation angle while minimizing ambiguity between CW and CCW rotation directions, analogous to direction discrimination for discretely displaced sine gratings being optimal for displacements that are one quarter of the gratings’ wavelength (Nakayama & Silverman, 1985).

Discussion

The results for Experiment 1 indicated that the perception of global rotational rocking increases with increases in the angle of rotation and, thus, with increases in the displacement distance of the outer elements composing each quartet. This is noteworthy because it is opposite to the effect of displacement distance on local motion detector activation, which decreases with increased interelement distance (Burt & Sperling, 1981; Gilroy et al., 2001). It therefore provides evidence that the detection of rotation occurs at a different processing level than does the detection of local motions. In addition, it implies that the beginning and end locations of the local motions detected in area MT map onto corresponding locations within the rotation-selective receptive fields of neurons in area MSTd. This would provide the basis for global activation to be determined by the rotation angle displaced by the elements (see Footnote 2).

Experiment 2

The purpose of this experiment was to provide psychophysical evidence that the perception of the hierarchical rotation pattern for the diamond quartet configuration entails feedback from global rotation detectors to local, directionally selective motion detectors. To do so, it was determined whether the effects of the feedback would persist and, thereby, affect the perception of rotation-consistent versus rotation-inconsistent local motions when the perception of global rotation no longer was possible.

Each trial was divided into two phases. During phase 1, the elements were displaced for all four quartets. Participants indicated whether or not they had perceived rotational rocking (their response was withheld until the end of the trial). During phase 2, the elements were displaced for only one of the quartets, so the stimulus did not support the perception of global rotation. Participants directed their attention to the top quartet for blocks of trials with aspect ratios equal to or less than one or to the left-hand quartet for blocks of trials with aspect ratios equal to or greater than one. The formation of a hierarchical global motion pattern would be indicated if the perception of global rotational rocking during phase 1 were followed, during phase 2 (when rotational rocking could no longer be perceived), by the perception of local motions in rotation-consistent directions. If this were the case, horizontal motion would be perceived during phase 2 for the top quartet, despite its aspect ratio favoring the perception of vertical motion (blocks of trials with aspect ratios ≤ 1.0), and vertical motion would be perceived during phase 2 for the left-hand quartet, despite its aspect ratio favoring the perception of horizontal motion (blocks of trials with aspect ratios ≥ 1.0).

Method

Stimuli

The quartet size (horizontal interelement distance = 0.34°) and the global radius (0.95°) of the diamond configuration were the same for every trial. What varied from one trial to the next was the aspect ratio of the quartets, which was changed by varying the vertical distance between the elements composing the quartets (see Fig. 1). This was done for the top and bottom quartets by shifting the locations of the inner elements toward the center of the diamond configuration, which kept the global radius constant for all aspect ratios (it was defined with respect to the outer element locations). For the block of trials with small aspect ratios, the aspect ratio for all four quartets was 0.5, 0.58, 0.66, 0.75, 0.83, 0.92, or 1.0. For the block of trials with large aspect ratios, the aspect ratio for all four quartets was 1.0, 1.08, 1.17, 1.25, 1.33, 1.42, or 1.5. There were two phases in each trial, each composed of three back-and-forth cycles (two 250-ms frames per cycle), with the same aspect ratio for the entire trial (the order of the seven aspect ratios was randomized from one trial to the next). The elements were white during the first phase and blue during the second phase. There was no temporal separation between the phases.

For the global-then-local condition (half the trials within each block), the elements were displaced back and forth for all four quartets during the first, global phase of each trial and were displaced for just one of the quartets during the second, local phase. For the only-local baseline condition (the other half of the trials within each block), the elements were displaced during both phases of each trial, but only for one of the quartets (the top quartet for the block of trials with aspect ratios ≤ 1.0 and the left-hand quartet for the block of trials with aspect ratios ≥ 1.0). Hence, the perceived motion direction was determined solely by the aspect ratio of the quartet and, presumably, by within-quartet, cross-direction inhibition. The perception of rotational rocking did not occur in the only-local condition because the elements were displaced for only one of the quartets. Thus, the motion perceived for the designated quartet in the only-local condition provided a baseline, when compared with the local-only second phase of the global-then-local condition, for determining whether global-to-local feedback would overcome the aspect-ratio-dependent effects of cross-direction inhibition.

Procedure

The participants, who were the same as those in Experiment 1, were instructed to spread their attention over the entire display at the start of phase 1 (as in Experiment 1) and then to shift their attention to the quartet designated for that block of trials when the elements changed from white to blue at the start of phase 2. At the end of the trial, they indicated whether or not they had perceived rotational rocking during phase 1 and, then, whether the motion perceived for the designated quartet during phase 2 had been horizontal or vertical.

Design

Blocks of 112 trials were created by the orthogonal combination of seven aspect ratios, whether there was motion for all four quartets (phase 1 of the global-then-local condition) or for only one of the quartets (the only-local condition), and eight repetitions. Trial order was randomized within subblocks of 14 trials. There were four testing sessions for each participant. Each session was composed of three blocks of 112 trials. Blocks of trials with small and large aspect ratios were tested during alternating sessions. In total, there were 48 trials per participant for each aspect ratio in both the global-then-local and only-local conditions.
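The trial counts in this design can be checked with a few lines of arithmetic. The sketch assumes that "alternating sessions" means two of the four sessions were devoted to each aspect-ratio range; the variable names are illustrative, and the small-aspect-ratio list is used only as an example.

```python
from itertools import product

ASPECT_RATIOS = [0.5, 0.58, 0.66, 0.75, 0.83, 0.92, 1.0]
CONDITIONS = ["global-then-local", "only-local"]
REPETITIONS_PER_BLOCK = 8
BLOCKS_PER_SESSION = 3
SESSIONS_TOTAL = 4
# Assumption: small and large aspect-ratio ranges each get half the sessions.
SESSIONS_PER_RANGE = SESSIONS_TOTAL // 2

# Every aspect ratio crossed with every condition gives one subblock.
cells = list(product(ASPECT_RATIOS, CONDITIONS))
subblock_size = len(cells)
trials_per_block = subblock_size * REPETITIONS_PER_BLOCK
trials_per_cell = REPETITIONS_PER_BLOCK * BLOCKS_PER_SESSION * SESSIONS_PER_RANGE

print(subblock_size, trials_per_block, trials_per_cell)
```

The three printed values correspond to the subblock size of 14, the block size of 112, and the 48 trials per aspect ratio and condition reported above.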

Results

The results for Experiment 2, which are presented in Fig. 6, were very similar for each of the 3 participants.

Fig. 6

Results for each participant in Experiment 2. a For aspect ratios ≤ 1.0, the perception of rotation-consistent horizontal motion is tested during phase 2 with the quartet at the top of the diamond configuration. b For aspect ratios ≥ 1.0, the perception of rotation-consistent vertical motion is tested during phase 2 with the quartet on the left side of the diamond configuration. Included in each graph are the proportion of trials on which rotational rocking was perceived during phase 1 of each trial, the proportion of these rotational-rocking trials for which rotation-consistent local motions were perceived during phase 2 (both for the global-then-local condition), and the results for the baseline, only-local condition

The only-local condition

The proportion of trials during which parallel-path horizontal motion was perceived for individual motion quartets increased with increases in aspect ratio, as in previous studies (e.g., Hock et al., 1993). This was the case for aspect ratios less than 1.0, for which directional judgments were made for the quartet at the top of the diamond configuration, and for aspect ratios greater than 1.0, for which directional judgments were made for the quartet on the left-hand side of the diamond configuration. The data for the latter trials are presented in Fig. 6b as declining frequency of vertical motion perception.

Phase 1 of the global-then-local condition

The proportion of trials during which global rotational rocking was perceived during phase 1 increased as the aspect ratio increased from 0.5 to 1.0 (Fig. 6a), and the frequency of its perception remained high for aspect ratios ranging from 1.0 to 1.5 (Fig. 6b). The results for the smallest aspect ratios were consistent with the activation of horizontal detectors for the top and bottom quartets being suppressed by cross-direction inhibition to such an extent that global-to-local feedback could not overcome the activational advantage of the aspect-ratio-favored vertical motion. As the aspect ratio increased, cross-direction inhibition would be expected to decrease (vertical detectors would be less strongly activated, so their inhibitory effects would become weaker), increasing the feedforward activation to global rotation detectors and, thus, enabling global-to-local feedback to become more effective in promoting the perception of rotational rocking.

Phase 2 of the global-then-local condition

Evidence for global-to-local feedback came from the difference in the perceived directions of local motion during the second phase of the global-then-local condition, relative to the only-local baseline condition. When rotational rocking was perceived in the global-then-local condition for aspect ratios less than 1.0, the proportion of these rotational-rocking trials for which rotation-consistent horizontal motion subsequently was perceived for the quartet at the top of the diamond configuration was greater than the proportion perceived for corresponding aspect ratios in the only-local condition (Fig. 6a). When rotational rocking was perceived in the global-then-local condition for aspect ratios greater than 1.0, the proportion of these rotational-rocking trials for which vertical motion subsequently was perceived for the quartet on the left side of the diamond configuration was greater than the proportion perceived for corresponding aspect ratios in the only-local condition (Fig. 6b). The difference between the two conditions during phase 2 was statistically significant when aspect ratios were varied from 0.5 to 1.0, F(1, 28) = 15.2, p < .01, and when aspect ratios varied from 1.0 to 1.5, F(1, 28) = 105.4, p < .01.

Correlation

The perception of rotational rocking during the first phase of the global-then-local trials was sometimes, but not always, followed by the perception of rotation-consistent local motions. It can be seen in Fig. 6a that the proportion of trials during which rotational rocking was perceived during phase 1 and the proportion of these “rotational-rocking” trials for which rotation-consistent local motions were perceived during phase 2 were highly correlated for aspect ratios ≤ 1.0, r = .99. The less often global rotational rocking was perceived during phase 1, the less often its perception was followed in phase 2 by the perception of rotation-consistent local motions. There was too little variance to assess this correlation for aspect ratios ≥ 1.0.

Discussion

During the second phase of each trial, local motions were perceived in rotation-consistent directions significantly more often following the perception of global rotational rocking (hierarchical rotation) than in the baseline only-local condition. This provided direct evidence for feedback from global rotation detectors to local motion detectors whose directional selectivity was rotation consistent. It will be shown by our computational simulations that for this to occur, the strength of the global-to-local feedback must exceed the long-range, cross-direction inhibition that promotes the perception of global parallel-path motion. Further computational simulations show that the correlation between the perception of rotational rocking during phase 1 and the perception of rotation-consistent local motions during phase 2 is also attributable to the strength of the global-to-local feedback and, in addition, that the initiation of global-to-local feedback becomes more dependent on stochastic fluctuations for smaller aspect ratios. This dependence results in less frequent perception of hierarchical rotation, weaker carry-forward of its activational effects from phase 1 to phase 2 of each trial and, therefore, less frequent perception of rotation-consistent local motions during phase 2.

Experiment 3

Participants in Experiment 2 indicated at the end of each trial whether or not they perceived rotational rocking. Although not specifically reported, global parallel-path motion usually was perceived when rotational rocking was not perceived. The purpose of Experiment 3 was to show that the perception of global parallel-path motion decreases with increasing aspect ratios and to provide evidence that it competes with global rotational rocking. Both were done by testing for hysteresis effects when the aspect ratio of the quartets was gradually increased from an initial small value or gradually decreased from an initial larger value. Hysteresis would be indicated if hierarchical rotational rocking, established when the aspect ratio was 1.0, continues to be perceived despite the aspect ratio’s gradually decreasing to values for which global parallel-path motion would otherwise be perceived and if global parallel-path motion, established when the aspect ratio was 0.5, continues to be perceived despite the aspect ratio’s gradually increasing to values for which the hierarchical rotational rocking would otherwise be perceived.

Method

The modified method of limits was first used by Hock, Kelso, and Schöner (1993) to measure hysteresis effects for single motion quartets whose aspect ratio was gradually increased or gradually decreased. With this method, it is possible to determine when perceptual transitions occur between global rotational rocking and global parallel-path motion without requiring a response during the sequence of ascending or descending changes in aspect ratio and, therefore, without concern for how quickly or slowly the observer decides that there was a perceptual change and executes an appropriate response. When the starting aspect ratio was 1.0, global rotational rocking was expected to predominate. When it was 0.5, global parallel-path motion was expected to predominate. The aspect ratio of the quartets was then gradually decreased from 1.0 (descending trials) or gradually increased from 0.5 (ascending trials) by a variable number of steps, so the final end point aspect ratio for each trial also was variable (see Footnote 3). For trials with just a few steps, a change in the perceived global pattern was unlikely. As the number of steps increased, the probability of the initial percept persisting for the entire trial decreased. Perceptual hysteresis was indicated when the percept for a particular end point aspect ratio was different, depending on whether the end point was reached via an ascending or a descending sequence of changes in aspect ratio.

The quartet size (horizontal interelement distance = 0.34°) and global radius (0.95°) of the diamond configuration were the same for every trial in this experiment. What varied within each trial was the aspect ratio of the quartets, which was changed in discrete steps by changing the vertical distance between the elements composing the quartets. Ascending trials began with an aspect ratio of 0.5 and increased after each pair of 250-ms frames in steps of approximately 0.08 to an end point aspect ratio of 0.58, 0.66, 0.75, 0.83, 0.92, or 1.0. Descending trials began with an aspect ratio of 1.0 and decreased after each pair of 250-ms frames in steps of approximately 0.08 to an end point aspect ratio of 0.92, 0.83, 0.75, 0.66, 0.58, or 0.5.
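The ascending and descending sequences of end point aspect ratios can be sketched in a few lines of Python. This is an illustration, not the stimulus-generation code; it assumes six equal steps of 0.5/6 (approximately 0.08) between 0.5 and 1.0, and the function name `end_point_aspect_ratios` is hypothetical.

```python
def end_point_aspect_ratios(start, end, n_steps=6):
    """List the possible end point aspect ratios for one trial type.

    Assumption: the range from start to end is divided into n_steps
    equal steps, so each step is about 0.08 for the 0.5 to 1.0 range.
    """
    step = (end - start) / n_steps
    return [start + k * step for k in range(1, n_steps + 1)]


ascending = end_point_aspect_ratios(0.5, 1.0)    # starts at 0.5, increases
descending = end_point_aspect_ratios(1.0, 0.5)   # starts at 1.0, decreases
```

With equal steps, the computed values agree with the reported end points (0.58, 0.66, 0.75, 0.83, 0.92, and 1.0 for ascending trials) to within 0.01.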

Two Florida Atlantic University students with normal or corrected-to-normal vision voluntarily participated in this experiment. Both were naïve with respect to its purpose. The participants were instructed to spread their attention over the entire display for the entire trial, at the end of which they indicated whether their initial percept was of global rotational rocking or global parallel-path motion and, then, whether there was a switch to the other pattern anytime during the trial. Blocks of 60 trials were created by the combination of 12 end point aspect ratios (6 for ascending and 6 for descending trials) and five repetitions. Trial order was randomized within subblocks of 12 trials. There were 12 blocks of 60 trials for participant P.G. (a total of 60 trials for each end point aspect ratio) and 8 blocks of 60 trials for participant L.W. (a total of 40 trials for each end point aspect ratio).

Results

The frequency with which the initial percept switched to the alternative percept is graphed as a function of the trial’s end point aspect ratio in Fig. 7. It can be seen that the likelihood of a perceptual switch occurring during a trial was different, depending on whether the end point aspect ratio was preceded by an ascending sequence of aspect ratios (the vertical axis on the left side of the graph) or a descending sequence of aspect ratios (the inverted vertical axis on the right side of the graph). For example, when the end point aspect ratio was 0.75, L.W. perceived rotational rocking without a switch to parallel-path motion for 94.3% of the descending trials and perceived parallel-path motion without a switch to rotational rocking for 88.6% of ascending trials. Perception therefore was bistable for the aspect ratio of 0.75 and other aspect ratios near it. That is, both global rotational rocking and global parallel-path motion were perceived for the same stimulus, the proportion of each depending on whether the aspect ratio of the quartets increased or decreased prior to reaching the end point value for that trial.

Fig. 7

Results for each participant in Experiment 3. Hysteresis effects were obtained by gradually increasing (ascending trials) or gradually decreasing (descending trials) the aspect ratio of the motion quartets. The proportion of switches from global parallel-path motion to global rotational rocking is indicated for ascending trials by the vertical axis on the left. The proportion of switches from global rotational rocking to global parallel-path motion is indicated for descending trials by the inverted vertical axis on the right

The dynamical model

Dynamical models with recurrent feedforward/feedback have previously been developed by a number of investigators to account for various aspects of motion integration and, in particular, the aperture effect (Bayerl & Neumann, 2004; Chey, Grossberg, & Mingolla, 1997; Grossberg, Mingolla, & Viswanathan, 2001; Tlapale, Masson, & Kornprobst, 2010). The purpose of the dynamical model developed in the present article was to determine whether global-to-local feedback can account for the formation of hierarchical motion patterns. To this end, the model includes variables representing the activation of the directionally selective, local motion detectors and CW and CCW global rotation detectors that are stimulated by the diamond quartet stimulus. It was assumed that each activation variable represents the response of an ensemble of detectors at that location with overlapping selectivities.

Focusing the model on the determination of whether global-to-local feedback can account for the formation of hierarchical motion patterns limited its scope. Thus, the spatial location of each quartet is crucial for the activation of global rotation detectors, but variables representing element locations were not explicit in the model. Other global pattern detectors, like those found in MSTd for expansion and contraction (Tanaka & Saito, 1989), would require different locations for the quartets in order to be activated by local motions consistent with these global patterns. Research with stimuli combining rotational rocking with expansion and contraction would require that spatial location become an explicit part of the model.

Within these constraints, the results of all three experiments reported in this article were simulated with the same parameter values (selected to fit the data). Further simulations showed the following. (1) The formation of global parallel-path motion patterns can result from long-range, cross-direction inhibition among local, directionally selective motion detectors. (2) The strength of the global-to-local feedback predicts the frequency with which hierarchical rotational rocking is perceived. It also determines the activational advantage of feedback-excited, as compared with feedback-inhibited, local motion directions, which predicts the frequency with which rotation-consistent local motions are perceived immediately following the perception of hierarchical rotation. (3) Whether hierarchical rotational rocking or global parallel-path motion patterns are formed depends on the relative strength of global-to-local feedback versus long-range, cross-direction inhibition. (4) The initiation of global-to-local feedback and the stabilization of the hierarchical rotation pattern at both global and local levels require that the activation of the global detectors pass through a detection instability, as described in the introduction (Bicho et al., 2000; Hock & Schöner, 2010, 2011; Johnson et al., 2009; Schneegans & Schöner, 2008; Schöner, 2008). The self-excitation threshold for the detection instability can be exceeded when weak activation of local detectors is supplemented by stochastic fluctuations. Details for the computational simulations that follow are included in the Appendix.

Equations

The dynamical model is composed of the 34 coupled differential equations described in the Appendix. The equations determine how activation will evolve over time for the 32 variables representing the activation of each local, directionally selective motion detector (eight directions for each of the four motion quartets composing the diamond quartet configuration) and the 2 variables representing the activation of the CW and CCW global rotation detectors for the outer rotation path. (Rotation never was perceived for the inner rotation path; see Footnote 4.)

The equation for each local motion detector determines, at each moment in time, how its activation will change during the next moment in time. This change is determined by the rate of change computed from the differential equation for that detector. For example, the activation variable u_{Br_B} (the rightward detector at the bottom of the quartet that is located at the bottom of the diamond configuration) will increase when its rate of change, du_{Br_B}/dt, is positive and will decrease when it is negative. The direction and size of the activation change will depend on the local detector’s current level of activation, the level of stimulus-initiated activation relative to its no-stimulus resting level, short-range cross-direction inhibition from within the same quartet, long-range cross-direction inhibition from the other quartets, the feedback that is received from global rotation detectors, and finally, random noise. How the activation of each global rotation detector will change in the immediate future likewise depends on its rate of change (for example, whether du_{CCW}/dt is positive or negative). The direction and size of the activation change will be determined by its current level of activation, the feedforward input received from rotation-consistent local motion detectors, and random noise.

With these recursive increases and decreases, the activation levels of the 34 variables evolve over time until they settle at steady-state values for which all remaining changes in activation are due to random fluctuations (e.g., Gilbert & Sigman, 2007; Hock et al., 2003). This occurs when the rates of change of all the activation variables are approximately 0. The model generates motion-perceived signals when the steady-state activation values for the detectors exceed the read-out threshold, which is set at 0.

Stimulus-initiated activation

Local motion detectors with opposing directional selectivity are alternately stimulated with each frame change. For example, leftward motion is stimulated along the top of the quartet on the right side of the diamond (u_{Tl_R}) during odd-numbered frames, and rightward motion is stimulated at the same location (u_{Tr_R}) during even-numbered frames.

The results of Experiment 1 indicated that for a fixed global radius (i.e., the distance from the center of the diamond configuration to the outer elements of the quartets), the perception of global rotational rocking increases with the angle of rotation and, thus, with increases in the distance between the elements composing each of the motion quartets (Fig. 5b). As was indicated earlier, this is opposite to the effect of interelement distance on local motion detector activation, which decreases with increased interelement distance (Burt & Sperling, 1981; Gilroy et al., 2001). In order to reconcile these opposing factors, it was assumed that the input to global rotation detectors entails activation that feeds forward from local motion detectors with rotation-consistent directional selectivity (more activation for smaller interelement distances) and, further, that the beginning and end points of the local motions map onto corresponding locations within the receptive fields of the global rotation detectors. Local detector activations are calculated from the inverse of the distance between the element locations within each quartet, and the rotation angles determined by the beginning and end points of the rotation-consistent local motions are calculated as the arctangent of the interelement distance within each quartet divided by the global radius. (The calculated rotation angle is based on vertical interelement distances for the left and right quartets and horizontal interelement distances for the top and bottom quartets.)
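The two opposing quantities described above can be sketched in Python as follows. The sketch covers only the inverse-distance and arctangent calculations stated in the text; the rescaling described in the Appendix is not reproduced. The function names are hypothetical, and the example values (interelement distances around 0.34° and a global radius of 0.95°) are used purely for illustration.

```python
import math


def local_activation(interelement_distance_deg):
    """Unscaled local detector activation: the inverse of the distance."""
    return 1.0 / interelement_distance_deg


def rotation_angle(interelement_distance_deg, global_radius_deg):
    """Rotation angle (radians) spanned by a rotation-consistent local motion."""
    return math.atan(interelement_distance_deg / global_radius_deg)


# Illustrative distances (degrees of visual angle) at a fixed global radius.
distances = [0.17, 0.34, 0.51]
activations = [local_activation(d) for d in distances]
angles = [rotation_angle(d, 0.95) for d in distances]
```

As the interelement distance grows, `activations` decreases while `angles` increases, which is the opposition that the weighting of feedforward activation by rotation angle is meant to reconcile.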

Activation values and rotation angles are then rescaled such that the product of the local detector activation and the rotation angle intercepted by the local motions increases with aspect ratio (see the Appendix for details). The calculated rotation angles are inserted into the equations for the CW and CCW rotation detectors as weighting factors that are multiplied by the activations that feed forward from the corresponding local motion detectors. As a result, there is no global level activation unless there is sufficient stimulus-initiated feedforward activation from rotation-consistent local level detectors, although somewhat weak stimulus-initiated feedforward activation can be sufficient when it is supplemented by stochastic fluctuations.

Cross-direction inhibitory interactions

Competing with the perception of global rotational rocking (i.e., alternating CW and CCW rotation) is the perception of global parallel-path motion, which reflects the parallel-path horizontal or vertical motion that is perceived within individual motion quartets. Whether horizontal or vertical motion is perceived within each motion quartet (they are never perceived simultaneously) is determined in the model by short-range, cross-direction inhibitory interactions, which are implemented with the Naka–Rushton equation (Naka & Rushton, 1966) graphed in Fig. 8a. It can be seen in the figure that cross-direction inhibition can be initiated while activation remains below the threshold level required for perception, as previously determined experimentally by Levinson and Sekuler (1975) and Hock, Schöner, and Hochstein (1996). It is because the strength of cross-direction inhibition is activation dependent that the more strongly activated of the competing motion directions prevails. It inhibits its competitors more than vice versa.
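The activation dependence of cross-direction inhibition can be illustrated with the standard Naka–Rushton form, x^n / (x^n + s^n). The Python sketch below is not the parameterization used in the model (see the Appendix). It assumes a hypothetical onset below 0, so that inhibition begins while activation is still below the read-out threshold, and it models a flatter function with a larger semi-saturation constant; all parameter names and values are assumptions.

```python
def naka_rushton(u, onset=-1.0, semi_sat=1.0, exponent=2.0, max_strength=1.0):
    """Inhibition strength as a saturating function of activation u.

    onset is a hypothetical activation level (below the read-out
    threshold of 0) at which inhibition starts to build up.
    """
    x = u - onset
    if x <= 0.0:
        return 0.0
    xn = x ** exponent
    return max_strength * xn / (xn + semi_sat ** exponent)


def short_range(u):
    """Steeper function, standing in for within-quartet inhibition."""
    return naka_rushton(u, semi_sat=1.0)


def long_range(u):
    """Flatter function (assumed larger semi-saturation constant)."""
    return naka_rushton(u, semi_sat=3.0)


# A more strongly activated detector inhibits its competitor more.
strong, weak = short_range(1.0), short_range(0.2)
# Inhibition is already present at a subthreshold activation of -0.5.
subthreshold_inhibition = short_range(-0.5)
```

Because the function is monotonic in activation, the more strongly activated of two competing directions always delivers the larger inhibition, which is why it prevails.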

Fig. 8

a Short- and long-range inhibitory interaction functions determined by Naka–Rushton equations (Naka & Rushton, 1966). The flatter function for the long-range interaction reflects the slower buildup of long-range inhibition assuming transmission over unmyelinated neurons within the same processing module. b Simulation 1: Proportion of trials for which the perception of global parallel-path motion is signified. When aspect ratios are small, motion is likely to be along the vertical axis for all four quartets, even without long-range inhibition, because the small aspect ratios strongly favor the perception of vertical motion. The presence of long-range interactions increases the perception of global parallel-path motion, particularly for the aspect ratios approaching 1.0. c The Naka–Rushton functions for local-to-global feedforward and global-to-local feedback are steeper than the functions for long-range cross-direction inhibition. This is consistent with the presence of neural myelination between different processing levels in the brain, which results in relatively rapid transmission of excitation and inhibition between the levels

Consistent with horizontal connections between local motion detectors being unmyelinated and, therefore, distance dependent (Bringuier et al., 1999; Chavane et al., 2000; Grinvald et al., 1994), the inhibitory interaction function for long-range cross-direction inhibition between the quartets is flatter than that for short-range cross-direction inhibition within each quartet (Fig. 8a). As a result, the effects of long-range inhibition are weaker and delayed by having their influence grow relatively slowly. In the absence of competing global-to-local feedback, the strength of the long-range inhibition is nonetheless sufficient to significantly increase the likelihood that all the quartets, regardless of their aspect ratio, have the same parallel-path motion pattern (all vertical or all horizontal). The effect of long-range cross-direction inhibition for quartets with aspect ratios ≤ 1.0 is shown in simulation 1 (Fig. 8b). That is, in the absence of long-range interactions, the perception of the same parallel-path motion for each quartet in the diamond configuration occurs relatively infrequently but almost always occurs when long-range cross-direction inhibition between motion detectors is included in the model.

Feedforward and feedback

The function relating the activation of each rotation-consistent local detector to the activation that feeds forward to a global rotation detector is implemented with a Naka–Rushton equation (Naka & Rushton, 1966); only positive activation values feed forward (Fig. 8c). Also illustrated by the graph in Fig. 8c is the Naka–Rushton equation implementing feedback from global rotation to local directionally selective detectors. These Naka–Rushton functions are steeper than the functions for long-range cross-direction inhibition (Fig. 8a), consistent with the relatively rapid neural transmission between different processing levels in the brain, as compared with neural transmission within the same level of processing.

As was indicated earlier, the stabilization of the hierarchical rotation pattern in the model occurs by virtue of activation passing through a detection instability. That is, global-to-local feedback occurs only when the activation of global rotation detectors crosses a self-excitation threshold, which is set at 0 for the present simulations. (The no-stimulus resting levels for the global detectors are at negative activation levels.) This creates a recurrent feedforward/feedback loop that boosts activation for both global rotation detectors and rotation-consistent local motion detectors and, in addition, inhibits activation for local motion detectors whose directional selectivity is rotation inconsistent. If it is strong enough, the excitatory feedback to rotation-consistent local detectors can exceed the effects of cross-direction inhibition, resulting in the perception of global rotational rocking. If it is not strong enough, global parallel-path motion is perceived.
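The bistability created by a self-excitation threshold can be illustrated with a one-variable Python sketch. It is a deliberate simplification and not the model itself: the recurrent feedforward/feedback loop is collapsed into a single hypothetical boost that switches on when activation exceeds 0, and the function name, resting level, and input values are all assumptions.

```python
def settle_global(u0, stimulus, resting=-1.0, self_excitation=1.0,
                  threshold=0.0, tau=10.0, dt=1.0, steps=2000):
    """Euler-integrate tau * du/dt = -u + resting + stimulus + boost.

    boost stands in for the recurrent loop: it is added only while the
    activation is above the self-excitation threshold.
    """
    u = u0
    for _ in range(steps):
        boost = self_excitation if u > threshold else 0.0
        u += dt * (-u + resting + stimulus + boost) / tau
    return u


# Same stimulus, different starting activation: two stable outcomes.
stays_off = settle_global(u0=-0.5, stimulus=0.8)   # never crosses 0
switches_on = settle_global(u0=0.1, stimulus=0.8)  # starts just above 0
```

With this input, activation that starts below the threshold settles at a negative value, whereas activation that has been pushed just above the threshold (for example, by a random fluctuation) is captured by the self-excited state and settles well above 0.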

In the model, cross-direction inhibition within each quartet is the reason global pattern formation for the diamond quartet stimulus is bistable. For example, when there is greater activation of a leftward motion detector at the top of the upper quartet because of global-to-local feedback, less activated vertical motions would be suppressed, despite being favored by the quartet’s aspect ratio when it is less than one. Hierarchical rotational rocking would be perceived. If there is insufficient global-to-local feedback, leftward motion would be suppressed, and vertical parallel-path motion would be perceived. Thus, either hierarchical rotation or global parallel-path motion is perceived, never a mixture of the two.

Simulations of experimental results

The results for all three experiments reported in this article, averaged over the participants, are well simulated by the computational model, all with an identical set of parameter values that were selected to fit the data (Fig. 9).

Fig. 9

a Simulation 2 for the averaged results of Experiment 1, which did not include M.A., for whom the range of global radii tested was different from that for the other two participants. b Simulation 3 for the averaged results of Experiment 2, and c simulation 4 for the averaged results of Experiment 3

Experiment 1

This experiment provided evidence that the detection of rotation occurs at a different processing level than the detection of local motion. This followed from the perception of global rotational rocking increasing with the displacement distance of the outer elements composing each quartet, which is opposite to the effect of displacement distance on the activation of local motion detectors, which decreases with displacement distance. These opposing effects were implemented in the model by having the rotation angle modulate the response of global rotation detectors to the activation that feeds forward from local, directionally selective motion detectors. The successful simulation of the results for Experiment 1 (simulation 2; Fig. 9a) indicated that this separation by level of processing was a computationally plausible way to account for the opposing effects of displacement distance on local motion detection and the perception of rotational rocking.

Experiment 2

The central feature of this experiment was the transition between phase 1 of each trial, during which global rotational rocking was possible because local motion detectors were stimulated for all four quartets, and phase 2, during which global rotational rocking was not possible because local motion detectors were stimulated for only one quartet. Simulation 3, which was done for aspect ratios ≤ 1.0, determined whether the perception of hierarchical rotation was signified during the last two frames of phase 1 and the proportion of these hierarchical trials for which rotation-consistent horizontal motion was signified during the first two frames of phase 2 (at the top of the upper quartet in the diamond configuration). The simulation successfully replicated the correlation between the proportion of trials for which hierarchical rotation (rotational rocking) was perceived during phase 1 and the proportion of these “hierarchy perceived” trials that were followed, at the start of phase 2, by the perception of local motion in rotation-consistent horizontal directions (Fig. 9b). It also replicated the results indicating that local motions for a single quartet were rotation consistent more frequently following the perception of hierarchical rotational rocking (the global-then-local condition), as compared with the baseline only-local condition, for which motion directions depended entirely on the aspect ratio of the quartet.

A further examination of the model probed for the basis of the correlation between the frequencies of hierarchical perception during phase 1 and the perception of rotation-consistent local motions during phase 2. Simulation 5 showed that the correlation is attributable to differences in the strength of the global-to-local feedback for different aspect ratios of the motion quartets. Stronger feedback is obtained for aspect ratios approaching 1.0 because rotation-consistent horizontal motions for the top and bottom quartets are not affected as much by cross-direction inhibition from vertical motion as is the case for smaller aspect ratios, so more activation feeds forward to global rotation detectors, as compared with the smaller aspect ratios. Feedback strength was measured midway during frame 5 of the simulated trials for which a hierarchical CCW rotation pattern was formed (see Fig. 10a, which compares the temporal evolution of activation for the rotation-consistent leftward and rotation-inconsistent downward motions of the motion quartet located at the top of the diamond configuration). This measure almost perfectly predicted (r = .99) the frequency with which hierarchical rotation was actually perceived (Fig. 10b).

Fig. 10

Simulation 5, which accounts for the correlation in Experiment 2 between the proportion of trials during which rotational rocking was perceived during phase 1 and the proportion of these trials for which rotation-consistent local motion was perceived during phase 2. a A single-trial simulation of the activation levels for rotation-consistent (counterclockwise [CCW]) leftward motion and rotation-inconsistent downward motion for the motion quartet located at the top of the diamond configuration (aspect ratio = 1.0). Alternating CCW and clockwise (CW) rotation and alternating rotation-consistent leftward (CCW) and rightward (CW) motions were perceived during phase 1 for this quartet (only leftward motions are indicated). The perception of rotation-consistent local motions is signified during phase 2, during which global rotation was not possible. Illustrated on the graph is when the strength of global-to-local feedback was measured (midway through the fifth frame), and when the difference in subthreshold activation between rotation-consistent leftward motion and rotation-inconsistent downward motion was measured (1 ms before the start of frame 7). b Scatterplot for which each point represents the strength of global-to-local feedback for trials in which the rotational rocking pattern was formed (measured within the simulation), and the actual frequency of its perception during phase 1. c Scatterplot for which each point represents the difference in subthreshold activation between rotation-consistent leftward motion and rotation-inconsistent downward motion measured within the simulation (for trials on which the rotational rocking pattern is formed), and the actual frequency with which the perception of rotation-consistent leftward motion during phase 2 followed the perception of hierarchical rotation during phase 1. d Scatterplot for which each point represents the strength of global-to-local feedback and the subthreshold activation difference between rotation-consistent leftward motion and rotation-inconsistent downward motion, both measured within the simulation. Alongside each point in the three scatterplots is the aspect ratio of the motion quartets

The perception of global rotation was not possible during phase 2 of each trial because motion-inducing element displacements occurred for only one of the four quartets. At the start of phase 2 (frame 7), motion perception for the quartet at the top of the diamond configuration depends in the model on the subthreshold activational advantage of rotation-consistent leftward motion at the end of frame 6, during which CW rotation and rotation-consistent rightward motion are perceived (activation is subthreshold for leftward and downward motions because they are not stimulated during even-numbered frames). The perception of rotation-consistent leftward motion is signified at the start of phase 2 because cross-direction inhibition from the activated rightward motion detector suppresses the activation of the vertical motion detectors that are about to be stimulated. This predisposes the perception of rotation-consistent leftward motion, even though CCW global rotation is not perceived. (Hock et al., 2003, refer to these as future-shaping interactions.) The stronger the global-to-local feedback, the greater will be the inhibition of the rotation-inconsistent vertical motion detectors that will be stimulated in the near future and, therefore, the more likely it will be that motion perceived at the start of phase 2 will be in the leftward direction, consistent with CCW rotation (although the rotation is not perceived).

To confirm the latter, we measured the activational states of the competing leftward and downward motion detectors at the end of frame 6, 1 ms before they were stimulated during frame 7 (Fig. 10a). It can be seen in Fig. 10c that the size of the subthreshold activational advantage of rotation-consistent local motions at the end of phase 1 almost perfectly predicts the proportion of trials on which rotation-consistent horizontal motion was actually perceived during phase 2, following the perception of hierarchical rotation during phase 1 (r = .99). This correlation was obtained because the size of the activational advantage for rotation-consistent local motions determines the probability that it will be reversed by a sufficiently large random fluctuation in activation. Large random fluctuations occur less often than small ones, so rotation-inconsistent directions are perceived infrequently when the activational advantage of rotation-consistent directions is relatively large (aspect ratios approaching 1.0). Smaller, more probable random fluctuations would be sufficient to reverse a relatively small activational advantage for rotation-consistent motion directions, which is why rotation-inconsistent motions are more likely for aspect ratios closer to 0.5.
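The relation between the size of the activational advantage and the probability of its reversal can be illustrated with a minimal sketch. It assumes, purely for illustration, that the random fluctuation is a single zero-mean Gaussian draw with standard deviation `noise_sd`; the function name, the arbitrary units, and the example advantages are hypothetical and are not values taken from the model.

```python
import math


def reversal_probability(advantage, noise_sd):
    """Probability that a zero-mean Gaussian fluctuation exceeds `advantage`.

    Assumption: the fluctuation is Gaussian; the text above only says that
    large fluctuations are less probable than small ones.
    """
    return 0.5 * math.erfc(advantage / (noise_sd * math.sqrt(2.0)))


# Hypothetical activational advantages in arbitrary units.
for advantage in (0.5, 1.0, 2.0):
    p = reversal_probability(advantage, noise_sd=1.0)
    print(f"advantage={advantage}: reversal probability={p:.4f}")
```

Under this assumption, the reversal probability falls monotonically as the advantage grows, which is the qualitative pattern described for aspect ratios approaching 1.0 versus those closer to 0.5.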

Finally, what knits the above correlations together is the correlation between the two measures taken from the simulation. That is, the strength of global-to-local feedback midway during frame 5 of phase 1 is highly correlated (r = .98) with the subthreshold activation advantage of rotation-consistent leftward motion just before the start of phase 2 (Fig. 10d).

Is feedback necessary?

It might be argued that the results of Experiment 2 can be accounted for by a strictly feedforward model. Accordingly, the rotational-rocking percept would be determined by the vector combination of stimulus-activated local motions within each quartet and feedforward-activated global rotation. The global activation would have to persist across the phase 1/phase 2 transition in order to affect local motion directions at the start of phase 2. However, it can be seen in Fig. 10a that because of the back-and-forth nature of the motion quartet stimulus, phase 1 ends with the perception of CW rotation, but phase 2 begins with local motions that are consistent with CCW rotation. The carry-forward from the end of phase 1 that would lead to local motions consistent with CCW rotation at the start of phase 2 cannot be persistent activation of CCW rotation detectors, because the activation of CW rotation detectors precedes the start of phase 2. As was indicated above, the carry-forward from phase 1 to phase 2 is, instead, attributable to future-shaping, cross-direction inhibitory interactions that are enhanced by global-to-local feedback. This, of course, does not rule out the possibility that, under different circumstances, local-to-global feedforward would be sufficient for the formation of a hierarchical pattern.

Experiment 3

The results of the first two experiments were well simulated by the dynamical computational model, so it was not surprising that the results of Experiment 3 and their simulation would also exhibit hysteresis, a signature characteristic of dynamical models (Hock & Schöner, 2010; Hock et al., 2003; Wilson, 1999). The hysteresis provided evidence for competition between the formation of global parallel-path and hierarchical rotation patterns (simulation 4; Fig. 9c). That the former depends on long-range cross-direction inhibition and the latter on global-to-local feedback was confirmed by simulation 6, which showed that increasing the strength of long-range inhibition decreases, and increasing the strength of global-to-local feedback increases, the frequency with which the perception of hierarchical rotation is signified (Fig. 11a).
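Why hysteresis is a signature of dynamical models can be illustrated with a toy example. The sketch below is not the model used in the simulations: it is a single hypothetical unit with a resting level, a hard self-excitation threshold at zero, and a fixed loop gain, with all parameter values chosen only for illustration. The input is swept upward and then downward, and the levels at which the unit switches on and off are recorded.

```python
def relax(u, feedforward, rest=-2.0, loop_gain=1.5, tau=10.0, steps=500):
    """Let one unit settle under a fixed feedforward input (Euler steps)."""
    for _ in range(steps):
        # Hypothetical loop term: active only above the threshold (u > 0).
        loop = loop_gain if u > 0.0 else 0.0
        u += (-u + rest + feedforward + loop) / tau
    return u


def switch_points(levels):
    """Input levels at which the unit switches on (ascending) and off (descending)."""
    u, on_at, off_at = -2.0, None, None
    for ff in levels:                 # ascending sweep
        u = relax(u, ff)
        if on_at is None and u > 0.0:
            on_at = ff
    for ff in reversed(levels):       # descending sweep
        u = relax(u, ff)
        if off_at is None and u <= 0.0:
            off_at = ff
    return on_at, off_at


levels = [i / 10 for i in range(31)]  # input levels 0.0 to 3.0
on_at, off_at = switch_points(levels)
print("switches on at", on_at, "and off at", off_at)
```

Because the loop term keeps the unit active once the threshold has been crossed, the unit switches off at a lower input level than the one at which it switched on, so the state at intermediate inputs depends on the direction of the sweep.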

Fig. 11

a Simulation 6: The proportion of trials during which the rotational-rocking (hierarchical rotation) pattern is formed increases with increases in the strength of global-to-local feedback and decreases with increases in the strength of long-range, cross-direction inhibition. b Simulation 7: The proportion of trials for which the global rotational-rocking (hierarchical rotation) pattern is formed within the simulation depends on the level of random fluctuations for the smallest aspect ratios, but not for the larger aspect ratios. Relatively large random fluctuations are needed to initiate global-to-local feedback for stimuli with relatively weak feedforward activation.

Detection instability

Global-to-local feedback that is strong enough to overcome the effects of cross-direction inhibition is initiated in the model when there is sufficient feedforward activation from rotation-consistent local motion detectors for global activation to cross the self-excitation threshold and create a detection instability (Bicho et al., 2000; Hock & Schöner, 2010, 2011; Johnson et al., 2009; Schneegans & Schöner, 2008; Schöner, 2008). A recursive feedback loop is initiated in which global-to-local feedback increases the activation of rotation-consistent local motions, which, in turn, increases the local-to-global feedforward activation that boosts global activation, and so on. As was indicated earlier, however, it also is possible for global-to-local feedback to be initiated by weaker input activations when stochastic fluctuations momentarily lift the activation of global rotation detectors above the self-excitation threshold. This is demonstrated by simulation 7 (Fig. 11b). It can be seen that decreasing the magnitude of the stochastic fluctuations for the global rotation detectors reduced the likelihood that there would be a fluctuation large enough for activation to cross the self-excitation threshold. As a result, the hierarchical rotation pattern was formed less often for stimuli with aspect ratios of 0.66 and 0.58 but continued to be formed for stimuli with larger aspect ratios. For the latter, it appears that the stimulus-initiated activation that feeds forward from the local to the global motion detectors is sufficient for activation to cross the self-excitation threshold that initiates the recursive feedforward/feedback loop and, thereby, stabilizes the formation of the hierarchical rotation pattern; stochastic fluctuations were not required.

Whether hierarchical rotational rocking or global parallel-path patterns are formed for diamond quartet stimuli that need stochastic fluctuations in order for global rotation detectors to be activated is determined by what happens when the activation of global rotation detectors reaches the neighborhood of what we have defined as the self-excitation threshold. If the activation of global rotation detectors falls short of this threshold despite the presence of random fluctuations, global parallel-path motion is perceived. If the threshold is crossed, global-to-local feedback is initiated, and the closure of feedforward/feedback loops boosts activation for both the global rotation detectors and the local, rotation-consistent motions to levels that exceed the cross-direction inhibition that promotes parallel-path pattern formation.
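The role of stochastic fluctuations near the self-excitation threshold can be sketched with a toy simulation. This is an illustration under stated assumptions, not the model used in simulation 7: a single hypothetical global unit receives a fixed feedforward input, Gaussian noise is added at every time step, and a fixed loop gain (standing in for the feedforward/feedback loop) is switched on whenever activation exceeds a threshold of zero. All names and parameter values are hypothetical.

```python
import math
import random


def engages(feedforward, noise_sd, seed, rest=-2.0, loop_gain=4.0,
            tau=10.0, steps=2000):
    """Simulate one trial; return True if activation ends above threshold."""
    rng = random.Random(seed)
    u = rest
    for _ in range(steps):
        # Hypothetical loop term: active only above the threshold (u > 0).
        loop = loop_gain if u > 0.0 else 0.0
        drift = (-u + rest + feedforward + loop) / tau
        u += drift + noise_sd * math.sqrt(1.0 / tau) * rng.gauss(0.0, 1.0)
    return u > 0.0


def proportion_engaged(feedforward, noise_sd, trials=50):
    """Proportion of simulated trials that end with the loop engaged."""
    hits = sum(engages(feedforward, noise_sd, seed) for seed in range(trials))
    return hits / trials


# Weak input (settles below threshold) versus strong input (crosses unaided).
for ff in (1.5, 2.5):
    for sd in (0.1, 1.0):
        print(f"feedforward={ff}, noise_sd={sd}: "
              f"{proportion_engaged(ff, sd):.2f}")
```

In this sketch the strong input crosses the threshold without any noise, whereas the weak input engages the loop only when the fluctuations are large enough, which parallels the qualitative dependence on noise level described for the smallest aspect ratios.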

Conclusions

The experimental and computational results of this study provide strong support for the role of global-to-local feedback in creating detection instabilities that result in the formation of either a stable hierarchical rotation pattern or a stable global parallel-path motion pattern. The simulations revealed several general features of global-to-local feedback that potentially pertain not just to the diamond quartet configuration, but also to hierarchical pattern formation in general. The first is that differences in the frequency with which a hierarchical pattern is formed can be attributed to graded differences in the strength of global-to-local feedback. The second is that the persistence of an activational advantage for pattern-consistent versus pattern-inconsistent local attributes allows attention to switch, with minimal loss, between the global and local levels of the pattern (as was observed informally for the diamond quartet stimuli). The third is that detection instabilities are created near the self-excitation threshold for the initiation of global-to-local feedback, and stochastic fluctuations can contribute to activation crossing this threshold; that is, random noise can increase the likelihood of global-to-local feedback.

Our results suggesting that MSTd detectors can play an important role in the formation of stable hierarchical motion patterns are consistent with evidence that MSTd neurons can respond to the rotation of geometric objects (Geesaman & Andersen, 1996) and to changes in an object’s shape (Sugihara, Murakami, Shenoy, Andersen, & Komatsu, 2002). It can be inferred from both findings that MSTd detectors can serve purposes other than what appears to be their primary function, which entails the detection of the optic flow generated by the perceiver’s locomotion (Duffy & Wurtz, 1995; Orban et al., 1992; Saito, 1993). With respect to that primary function, MSTd detectors that respond selectively to expanding optic flow would help maintain locomotion toward a target (Warren, Kay, Zosh, Duchon, & Sahuc, 2001), and MSTd detectors that respond selectively to rotation would compensate for the effects of head rotation on optic flow (Liu & Angelaki, 2009). The stimuli used in electrophysiological studies of MSTd neurons typically have much denser arrays of local motions than was the case in the present study, for which rotation-consistent local motions were sparse. An interesting possibility suggested by our results is that recursive feedforward/feedback loops could facilitate optic flow detection for relatively sparse optic flow fields and, further, that this could be enhanced by increases in noise levels, which would increase the likelihood that activation will cross the self-excitation threshold and initiate global-to-local feedback.