Introduction

Cognitive psychological research is said to have started in the 1950s (Miller, 2003; Sanders, 1998). Its aim is to understand how the brain processes information, that is, how it transforms, reduces, elaborates, stores, recovers, and uses information provided by the senses and how it controls speech and movement (Neisser, 1967). This information-processing approach is based on the careful scrutiny of behavioral measures like reaction time (RT), movement time, and accuracy in order to reverse-engineer the underlying processing system. It has roots in applied research in the 1940s, but has turned into a functional analysis of human information processing in its own right (Meyer, Osman, Irwin, & Yantis, 1988; Sanders, 1998). The information-processing approach addresses Marr’s (1982) well-known algorithmic/representational level. This level of analysis provides a link between Marr’s computational level (asking what problems the system solves and why it does that) and his implementation level (asking how the system is physically and neurally realized). The information-processing approach can be regarded as the successor of behaviorism, which claimed that it is not possible to study mental processes and that behavioral research should concern itself with the relationship between the environment and the observable behavior of people and animals (e.g., Bargh & Ferguson, 2000; Skinner, 1945).

Since the 1970s, technological advances have enabled researchers to assess in increasing detail the regional activity in the brain that is associated with information processing using techniques like EEG, PET and fMRI (Gazzaniga, Ivry, & Mangun, 2013). Simultaneously, the availability of increasingly powerful computers has enabled computational modeling of both cognitive and neural processes (Anderson, 1983; Anderson, Bothell, Byrne, Douglass, Lebiere, & Qin, 2004; De Garis, Shuo, Goertzel, & Ruiting, 2010; Goertzel, Lian, Arel, de Garis, & Chen, 2010; Kandel, Markram, Matthews, Yuste, & Koch, 2013; J. E. Laird, 2012). In recent years, research using behavioral, neural, and computational indices of behavior is gradually merging into what has been termed cognitive neuroscience (e.g., Gazzaniga et al., 2013).

A problem we address in the present paper is that cognitive neuroscience research does not benefit as much from cognitive psychological theorizing as it could, because theorizing in these two domains is still quite distinct (Forstmann, Wagenmakers, Eichele, Brown, & Serences, 2011; for interesting exceptions, see e.g. Anderson et al., 2004; Zylberberg, Dehaene, Roelfsema, & Sigman, 2011). One reason is that cognitive psychological research has not yet provided clear theoretical perspectives on the underlying cognitive processing architecture. Instead, most cognitive models are developed for a particular experimental paradigm without making clear how the proposed cognitive processes relate to those proposed by other cognitive models (for a classic and still valid critique, see Newell, 1973). As a consequence, the number of models accounting for human behavior continues to proliferate to the point that some models are simply forgotten over time (see, e.g., Abernethy & Sparrow, 1992). In the present article, we deal with this problem by addressing commonalities across three information processing models. Two of these models are based on classic research methods, the Additive Factors Method (Sanders, 1990, 1998) and the Psychological Refractory Period paradigm (Pashler, 1994). The third is a cognitive model of sequential motor behavior, the Dual Processor Model, that has been proposed by the first author of the present article (Verwey, 2001). On the basis of these models, we propose a framework called the Cognitive framework for Sequential Motor Behavior (C-SMB). This framework is argued to describe information processing in many tasks, including the execution of sequential movements.

We then use our framework to address a problem in motor behavior research: researchers do not always seem to realize that the same movement sequences can be executed with different processing strategies. This relates to the idea that, while not always acknowledged in the cognitive and movement science research communities, producing movement sequences is a cognitive task that also relies on central and perceptual processes (Rosenbaum, 2005; Rosenbaum, Chapman, Coelho, Gong, & Studenka, 2013). Strategies for producing movement sequences not only differ across participants; even individual participants appear to switch between execution strategies at times (e.g., following an error; Jentzsch & Dudschig, 2008; Notebaert et al., 2009). We address this problem by proposing a classification of sequencing strategies that can be used as a tool to design serial movement studies and to interpret the results of these studies.

The Dual Processor Model

We start off with an introduction of the Dual Processor Model because this model stands at the basis of the proposed processing framework (Verwey, 2001; for reviews, see Abrahamse, Ruitenberg, De Kleine, & Verwey, 2013; Rhodes, Bullock, Verwey, Averbeck, & Page, 2004). The Dual Processor Model is based on research with the Discrete Sequence Production (DSP) task. This task is characterized by sequence elements that take very little time to produce, namely key presses. Using such fast and simple movements allows reaction times to reflect the responsible cognitive processes that may remain concealed with other sequential movement tasks (Rhodes et al., 2004). Furthermore, the high execution rates reached with this task make it likely that execution is based on a single strategy that outperforms other ones.

Participants performing the Discrete Sequence Production task initially respond to each of a short series of (typically 6) stimuli by pressing the corresponding key. Fingers of individual participants are counterbalanced across sequential positions to eliminate finger-specific effects on responses at a particular sequential position (as reported by, e.g., Adam, 2008; Leuthold & Schröter, 2011). Because there are two alternative sequences, each starting with a different stimulus, participants gradually learn to respond to the display of the first stimulus by executing the entire sequence while ignoring the subsequent stimuli. This turns the task into a 2-choice RT task with familiar keying sequences as responses. The first sequence element (i.e., key press) is typically quite slow while the ensuing ones are much quicker (Fig. 1). The results of this type of task are explained by the notion that participants develop a representation linking two or more key presses together into what is called a motor chunk (Verwey, 1996; for older references to a similar construct, see Gallistel, 1980; Keele, 1986; Leonard & Newman, 1964; Newell & Rosenbloom, 1981; see also Footnote 1).

Fig. 1

Typical results obtained with a practiced participant executing a 6-key sequence in the Discrete Sequence Production task. With shorter sequences (≤5 key presses), the relatively slow response halfway through the sequence (indicating ‘concatenation’) is usually not observed (Copyrights granted by Abrahamse et al., 2013)

When 6-element sequences are executed, there is often a relatively slow element somewhere in the middle of the sequence. This slower element is thought to occur with 6-key sequences because motor chunks can represent subsequences of only up to 4 or 5 elements (e.g., key presses), so that a second subsequence is needed (Acuna et al., 2014; Verwey, Abrahamse, & Jiménez, 2009; Wymbs, Bassett, Mucha, Porter, & Grafton, 2012). The initiation of the second subsequence is slowed because the second motor chunk needs to be selected and loaded at the so-called concatenation point, and this cannot (entirely) occur during execution of the first subsequence (Verwey, 2001).
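The response-time pattern described above (slow first key press, fast within-chunk key presses, and a slower key press at the concatenation point) can be illustrated with a small sketch. This is a toy illustration only, not a model from the literature: the function name `toy_rt_profile`, the chunk sizes, and all millisecond values are hypothetical and chosen purely to show the shape of the profile.

```python
from typing import List, Sequence


def toy_rt_profile(
    chunk_sizes: Sequence[int],
    initiation_ms: int = 400,      # hypothetical: selecting and loading the first chunk
    concatenation_ms: int = 250,   # hypothetical: selecting and loading a later chunk
    execution_ms: int = 120,       # hypothetical: executing a key press within a chunk
) -> List[int]:
    """Return one illustrative response time per key press.

    `chunk_sizes` lists how many key presses each motor chunk holds,
    e.g. (3, 3) for a 6-key sequence executed as two chunks.
    """
    if any(size < 1 for size in chunk_sizes):
        raise ValueError("every chunk must hold at least one key press")

    rts: List[int] = []
    for chunk_index, size in enumerate(chunk_sizes):
        for position in range(size):
            if position > 0:
                rts.append(execution_ms)       # within-chunk key press
            elif chunk_index == 0:
                rts.append(initiation_ms)      # first key press of the sequence
            else:
                rts.append(concatenation_ms)   # first key press of a later chunk
    return rts


six_keys_two_chunks = toy_rt_profile((3, 3))
five_keys_one_chunk = toy_rt_profile((5,))
```

With these assumed values, a 6-key sequence executed as two 3-key chunks shows a slow fourth key press, whereas a 5-key sequence held in a single chunk shows no slow element after the first. How a given participant actually segments a sequence is an empirical question that this sketch does not address.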

One reason to present our cognitive framework for the production of movement sequences is to increase the awareness of cognitive scientists and cognitive neuroscientists of the processing complexities involved in preparing and executing even relatively simple motor sequences. This is important because behavioral research can provide additional constraints that make the study of the neural basis of human sequential behavior more goal directed and efficient (Forstmann et al., 2011). Cognitive neuroscience studies often use research paradigms from behavioral research, like the serial RT task, the Discrete Sequence Production task, and sequences of aimed movements, but they often do not use the behavioral models derived from those studies. A case in point is the indication that different processing strategies are used when performing different types of sequencing tasks (Verwey & Abrahamse, 2012; Verwey & Wright, 2014). These processing strategies involve different neural mechanisms (Debaere, Wenderoth, Sunaert, Van Hecke, & Swinnen, 2003; Jueptner et al., 1997), but this is often ignored when interpreting neural activity. It should be noted that even cognitive architectures like ACT-R (Anderson et al., 2004), SOAR (J. E. Laird, Newell, & Rosenbloom, 1987), and EPIC (Meyer & Kieras, 1997; for an overview, see, e.g., Goertzel et al., 2010) attribute serial skill to just one mechanism (namely increasing the rate of selecting responses, e.g., Lebiere & Wallach, 2001), rather than taking into account the ability to switch to other processing strategies and to use different types of representations.

Terminology

Before introducing a general framework for sequential motor behavior, we introduce a terminology because terms differ considerably across the various research paradigms. We refer to a representation where sometimes a mental construct or internal code is used. Such representations include memory chunks (Cowan, 2000; Miller, 1956; Newell & Rosenbloom, 1981), motor chunks (Verwey, 1999), and motor programs (Schmidt, 1975). Representations include information in a particular format or code, which may be verbal, spatial, or motoric. An information process translates an input into an output representation. A processing stage differs from a process in the sense that processing stages are defined by the results of the Additive Factors Method and may include several serial and/or parallel processes (Sanders, 1998). Processes are carried out by a processor and we here distinguish between processors at the perceptual, central and motor level. While the Dual Processor Model distinguishes a cognitive processor and a motor processor (Verwey, 2001), we here refer to the former processor as central processor (cf. Pashler, 1994). This is because the term cognitive is often used to refer to processing at any level (Neisser, 1967). Finally, because practicing a motor skill typically involves movement sequences that require no guidance by movement-specific stimuli, and actions may be less concrete than movements, we here prefer the term movement sequence over terms like response sequence, action sequence, and movement pattern.

The Cognitive framework for Sequential Motor Behavior

In this section, we describe the six assumptions of the Cognitive framework for Sequential Motor Behavior. As noted, these assumptions are inspired by the Additive Factors model (Sanders, 1990, 1998), the bottleneck model for the Psychological Refractory Period task (Pashler, 1994), and the Dual Processor Model (Abrahamse et al., 2013; Verwey, 2001). In short, these assumptions are that (1) knowledge is represented in perceptual, central-symbolic, and motor representations, and these may become part of a multidimensional representation; (2) short-term storage of information involves two partly overlapping stores, short-term memory and the motor buffer; (3) information is processed by sensory modality-specific processors at the perceptual level, a central processor, and output modality-specific motor processors; (4) these processors process information in a limited number of successive processing stages; (5) processors can operate independently, but they interact in various ways, which can account for several phenomena observed with executing movement sequences, including dual task interference; and (6) executive control of the information processing system is a function of the central processor, which can pre-activate during a preparation phase the structures that underlie processes and representations to be used later on. Basic to the C-SMB framework is that sequencing skill is based on two general strategies, namely responding to series of stimuli (i.e., external or stimulus-based control), and using sequence-specific representations (i.e., internal or plan-based control; Keele, 1968; Tubau, Hommel, & López-Moliner, 2007). Depending on the task at hand, each of these strategies can be used in various ways.

Assumption 1: Representations

We postulate that representations are perceptual when they are involved in, and result from, perceptual processing. Motor representations exist at the motor system level. Finally, representations at the central level that are not directly related to perceptual and motor processing are called central-symbolic. These central-symbolic representations are typically more complex. They are based on (i.e., grounded in) low-level perceptual and/or motor representations (Fischer & Zwaan, 2008; Goldfarb & Treisman, 2013; Stoet & Hommel, 1999; Treisman & Gelade, 1980), and may include verbal coding (Tubau et al., 2007). This assumption implies that the distinction between perceptual, central and motor representations is gradual and depends on the processes using them. Indeed, movement representations like the motor program (Schmidt, 1975; Schmidt & Lee, 1999) and hierarchical movement representations (Rosenbaum, Hindorff, & Munro, 1987) seem to include perceptual (e.g., expected feedback), central, and motor representations. The reason to still make a distinction between perceptual, central-symbolic, and motor representations is that when these representations have no overlap they can be used by independent processes (like when counting and executing a movement sequence at the same time; Verwey, Abrahamse, & De Kleine, 2010; Verwey, Abrahamse, De Kleine, & Ruitenberg, 2014).

With respect to serial behavior, this representation assumption implies that the same movement sequence can be represented in many ways, sometimes even at the same time. Familiar movement sequences may be stored as a unified representation in long-term memory that can be retrieved in a single operation and that may include motor, spatial, and/or verbal information (e.g., a remembered phone number). In the case of unfamiliar sequences, movement representations may be constructed step by step in short-term memory before or during movement execution. This may happen when a series of stimuli is displayed in advance (De Kleine & Van der Lubbe, 2011).

Assumption 2: Two independent temporary storage facilities

Cognitive theories claim that short-term memory consists of the temporary activation of representations in long-term memory (e.g., Anderson, 1983; Cowan, 1988, 1995). The links connecting the features that make up representations may be temporary, in which case these features are said to be bound. They may also be permanent and then the features are associated (Barber & O'Leary, 1997; Hommel & Colzato, 2009; Zorzi & Umiltà, 1995). In the case of permanent associations between lower order features, the entire representation is stored in long-term memory (see Footnote 2). In that case, activating a few features is likely to activate (i.e., prime) the entire representation, given concurrent activation of a particular goal and context (Ruitenberg, Abrahamse, De Kleine, & Verwey, 2012; Ruitenberg, De Kleine, Van der Lubbe, Verwey, & Abrahamse, 2012).

C-SMB assumes two partly overlapping temporary storage facilities for the production of movement sequences, a general short-term memory and a motor buffer. This corresponds with studies that concluded that the motor buffer is functionally separate from short-term memory (Gordon & Meyer, 1987; Magnuson, Robin, & Wright, 2008; Rosenbaum, Kenny, & Derr, 1983; Smyth & Pendleton, 1989; Sternberg, Knoll, Monsell, & Wright, 1988; Tattersall & Broadbent, 1991; Verwey, 1999; see Footnote 3).

C-SMB assumes that short-term memory holds central-symbolic (i.e., non-motor, spatial and verbal) representations of movements and movement sequences, as well as other information (like task goals). The motor buffer contains motor representations. These motor representations involve concrete movement features that need little further processing to produce actual movement, such as limb used, agonist–antagonist muscle activation patterns, sequential patterns of muscle-joint angles, and torques (Hikosaka et al., 1999; Shea, Kovacs, & Panzer, 2011). Higher-level processes provide these low-level, task-specific motor features with motor parameters. These parameters include movement direction (Rosenbaum, 1980), movement goal in egocentric coordinates (Willingham, Wells, Farrell, & Stemwedel, 2000; Witt, Ashe, & Willingham, 2008), movement timing (Klapp, 1995; Klapp & Jagacinski, 2011), movement force (Schmidt, 1975), and, in the case of movement sequences, movement order (Sternberg, Monsell, Knoll, & Wright, 1978; Verwey, 2001). The motor representations probably also include information on how the movements are adjusted to the biomechanics of the effector (i.e., effector-specific learning; Andresen & Marsolek, 2012; Park & Shea, 2003; Verwey & Clegg, 2005; Verwey & Wright, 2004), and how successive movements can be smoothly integrated into a movement sequence (i.e., coarticulation; e.g., Mattys, 2004; Shaffer, 1975).

Studies with the Discrete Sequence Production task have indicated that the capacity of the motor buffer is limited to about 3–5 elements (e.g., Verwey et al., 2009; Verwey & Eikelboom, 2003). This suggests that familiar movement sequences that are quite long generally require a succession of motor representations (Acuna et al., 2014; Bo & Seidler, 2009; Fendrich & Arengo, 2004; Kennerley, Sakai, & Rushworth, 2004; Verwey et al., 2010).

In a number of situations, representations in the two short-term storage facilities may be closely associated so that a single representation encompasses both short-term memory and the motor buffer. This probably holds for a fully specified motor program that involves connected central-symbolic and motor features (Schmidt, 1975). Other studies explicitly assume that motor representations also include the expected sensory feedback that typically results when that movement is executed (Wolpert, Ghahramani, & Flanagan, 2001). This feedback serves two purposes. First, movement features in long-term memory may be activated via the representation of the intended sensory feedback (Herwig, Prinz, & Waszak, 2007; Hommel, Müsseler, Aschersleben, & Prinz, 2001). Second, the expected feedback allows ongoing movements to be monitored for errors and, in the case of relatively slow movements, to correct these movements (Adams, 1971).

The assumption of an overlap between short-term memory and motor buffer explains indications of simultaneous execution of controlled and automatic processes within the response selection processing stage (as suggested by research into stimulus–response compatibility, ideomotor compatibility, practice, and the Stroop task). During this processing stage, one process may well identify the response movement in a controlled way, while another, parallel, process involves the automatic, direct priming of response features by stimulus features (Hommel et al., 2001; Kornblum, Hasbroucq, & Osman, 1990; Lien, McCann, Ruthruff, & Proctor, 2005; Logan, 1988). For instance, perceiving a stimulus at the left is likely to activate a ‘leftness’ feature that primes a particular movement (if the other features have already been prepared), while goal-based control processes can trigger movement to the right.

C-SMB assumes that, before movement can commence, the central processor loads the features of the movement into the motor buffer and short-term memory. This loading process is the result of what has been referred to as motor program activation and parameter specification (Rosenbaum, 1980; Schmidt, 1975), and as storing individual sequence elements in the motor buffer before sequence execution (Verwey, 1996). In the case of the repeated use (and co-activation) of these movement elements, their features become associated through Hebbian learning (‘what fires together, wires together’). With extensive practice and when stimuli map directly onto responses, motor parameters may also become closely associated with (the features making up) a particular motor program (as shown by Diedrichsen, Hazeltine, Kennerley, & Ivry, 2001; Goodman & Kelso, 1980; Keetch, Schmidt, Lee, & Young, 2005). This implies that, following consistent practice in a particular context, the entire motor representation can be loaded from long-term memory into the motor buffer in a single processing step by activating just a few features of that representation.
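The idea that co-activated features become associated, so that a few features can later activate the whole representation, can be sketched with the plain textbook Hebb rule. The framework itself does not specify a learning rule beyond the Hebbian slogan, so the update rule, the learning rate, the four feature labels, and the function names `hebbian_update` and `net_input` are all assumptions made for illustration.

```python
from typing import List

Matrix = List[List[float]]


def hebbian_update(weights: Matrix, activations: List[float],
                   learning_rate: float = 0.1) -> Matrix:
    """Strengthen the link between every pair of co-active features.

    Plain Hebb rule (an assumption of this sketch): the weight between
    features i and j grows by learning_rate * a_i * a_j. No self-connections.
    """
    n = len(activations)
    return [
        [
            weights[i][j]
            + (learning_rate * activations[i] * activations[j] if i != j else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]


def net_input(weights: Matrix, cue: List[float]) -> List[float]:
    """Activation each feature receives from the cued features."""
    n = len(cue)
    return [sum(weights[i][j] * cue[j] for j in range(n)) for i in range(n)]


# Hypothetical features: the first three belong to one practiced movement,
# the fourth is never active during practice.
features = ["index finger", "leftward", "short press", "foot"]
weights: Matrix = [[0.0] * len(features) for _ in features]

for _ in range(10):  # ten practice trials with the same co-active features
    weights = hebbian_update(weights, [1.0, 1.0, 1.0, 0.0])

# Cue a single feature and see which other features it primes.
primed = net_input(weights, [1.0, 0.0, 0.0, 0.0])
```

After practice, cueing only the first feature passes activation to the two features that were co-active with it and none to the unpracticed one. This is one simple way to picture how a few active features might prime an entire representation.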

Assumption 3: Processors at three processing levels

At the input level, three separate perceptual processors exist for the visual, auditory and proprioceptive modalities (Fig. 2). At the output level, there are two separate motor processors, one for the hand/foot modality and one for the speech modality (Pashler & Christian, 1994; Tattersall & Broadbent, 1991; see Footnote 4). In between the perceptual and motor processors there is a single central processor. Unlike the perceptual and motor processors, the central processor is versatile and may not always behave as a single unit (Fodor, 1983; Uttal, 2001). In a Discrete Sequence Production task context, this central processor is responsible for preparing and initiating both unfamiliar and familiar movement sequences, but it can also trigger individual movements of a familiar sequence, identify tones, and increment a counter in memory (Verwey et al., 2010, 2014). The central processor is assumed to also perform executive control functions like setting task goals, preparing the relevant perceptual and motor processors, and keeping central processes active. So, the central processor makes extensive use of short-term memory, loads the motor buffer, and is responsible for a variety of additional processes.

Fig. 2

C-SMB assumes three processors at the perceptual level (namely visual, auditory, and proprioceptive processors), a central processor system usually acting as a single functional processor, and two motor processors (manual/feet, speech processors). The depicted overlap between short-term memory (STM) and the motor buffer represents the storage of features with joint perceptual and motor significance (like ‘left’ and ‘right’). The cognitive and motor loops occur when information in short-term memory (STM) or in the motor buffer is repeatedly cycled through to deduce successive elements from a compound sequence representation

The notion that independent, limited-capacity processors are active at various levels of information processing is neither new nor controversial (Allport, 1980; Anderson, 1983; Kahneman, 1973; Pew, 1966; Schmidt, 1988). The assumption of separate processors at three processing levels is, by and large, consistent with the different types of processing resources that account for the patterns of interference between two simultaneously executed tasks (Wickens, 1984, 2008), and also with the assumption of the Theory of Event Coding that “early perceptual” and “late motor” processes are carried out independently from central processes (Hommel et al., 2001). The notion of a limited set of processors has been further used to explain the central bottleneck that is responsible for slowing the second of two successive choice reaction time tasks in the Psychological Refractory Period task (Meyer & Kieras, 1997; Ruthruff & Pashler, 2001; Tombu & Jolicœur, 2005).

C-SMB assumes that if a stimulus is presented, its features are processed by a perceptual processor that transmits its output to the central processor by loading a perceptual representation into short-term memory. This output often consists of a complete stimulus representation, but, in some speeded tasks, individual features of a stimulus may be transmitted successively (Miller, 1990; Sanders, 1990). In reaction tasks, the central processor uses this stimulus representation to identify the stimulus, and to construct a new movement representation or to select an existing one. This process is facilitated and biased by the current task context (Ruitenberg et al., 2012b; Wright & Shea, 1991). In a next processing stage, the central processor deduces the low level features of the activated movement representation that are then stored in the motor buffer where they are joined with stimulus-independent, task-dependent motor features that have already been stored during preparation. Determining each successive movement or movement feature from short-term memory may involve the central processor repeatedly cycling through a cognitive loop (Verwey, 1994). Once the motor buffer contains all necessary information, the motor processor starts executing the motor buffer content. The motor processor, evidently a highly complex piece of machinery that includes many feedback loops to produce each movement, relatively rigidly executes the motor buffer content. In the case where the motor buffer contains representations of several successive movements, the motor processor cycles through a motor loop to assess and execute each ensuing movement (Sternberg et al., 1978). In that case, the central processor is required only for initiating and not for executing the sequence.

Assumption 4: Processing stages and processes

Processing by the proposed C-SMB processors involves a number of distinguishable processing stages. This is indicated by reaction time research using the Additive Factors Method (Sternberg, 1969; for reviews, see Sanders, 1990, 1998). In this research, the existence of processing stages is derived from the effect on reaction time of pairs of experimental manipulations. The actual order of these processing stages is then inferred on logical grounds. Sternberg (1998, 2001) argued that the indications for serial processing stages result either from a single processor switching from one to the next processing stage, or from one processor waiting for the output of another processor. This fits our assumption of a central processor responsible for several central processing stages, flanked by processors at the perceptual and motor level. To prevent accumulation of inaccuracies across stages, each processing stage is assumed to have a constant output quality so that extra processing demands (e.g., when stimulus quality is poor) lead to longer processing durations (Sackur & Dehaene, 2009; Sanders, 1990).
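The core inference of the Additive Factors Method can be made concrete with a 2×2 design: if two manipulations affect different processing stages, their effects on mean reaction time add up, so the interaction contrast is zero; a non-zero contrast suggests that they affect at least one common stage. The sketch below only computes this contrast from cell means. The function name `interaction_contrast` and all reaction times are hypothetical, and a real analysis would test the interaction statistically rather than compare raw means.

```python
from typing import Dict, Tuple

CellMeans = Dict[Tuple[int, int], float]


def interaction_contrast(mean_rt: CellMeans) -> float:
    """Interaction contrast for a 2x2 design.

    mean_rt[(a, b)] is the mean RT (ms) at level a of factor A and
    level b of factor B, with levels coded 0 (easy) and 1 (hard).
    The contrast is the effect of B at the hard level of A minus the
    effect of B at the easy level of A.
    """
    effect_b_at_hard_a = mean_rt[(1, 1)] - mean_rt[(1, 0)]
    effect_b_at_easy_a = mean_rt[(0, 1)] - mean_rt[(0, 0)]
    return effect_b_at_hard_a - effect_b_at_easy_a


# Hypothetical cell means (ms): factor A adds 50 ms, factor B adds 30 ms.
additive_pattern: CellMeans = {(0, 0): 400.0, (1, 0): 450.0,
                               (0, 1): 430.0, (1, 1): 480.0}

# Hypothetical cell means (ms): the combined effect exceeds the sum.
interactive_pattern: CellMeans = {(0, 0): 400.0, (1, 0): 450.0,
                                  (0, 1): 430.0, (1, 1): 520.0}

additive_contrast = interaction_contrast(additive_pattern)        # 0.0
interactive_contrast = interaction_contrast(interactive_pattern)  # 40.0
```

Under the method's logic, the first pattern would be read as the two factors affecting separate stages, and the second as the two factors affecting a shared stage.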

Serial processing models have been refuted in the past because they would not acknowledge parallel processes and feedback loops. Also, they would not account for tasks in which stimuli prime response movements in unintended ways (like with the Eriksen flanker effect and the Stroop task). This priming shortcuts central processing stages and breaks down the additivity of central processing stages like stimulus identification and response selection (McClelland, 1979; Pashler & Baylis, 1991; Stafford & Gurney, 2011). A distinction between processes and processing stages can deal with these issues by assuming that various parallel processes and feedback loops occur within a single processing stage. For instance, response selection is assumed to include an automatic associative process and a controlled process in parallel (Adam & Koch, 2009; Kornblum et al., 1990; Lien et al., 2005).

Stimulus-based priming of movements can be dealt with by the assumption that short-term memory and the motor buffer overlap where it concerns features common to late perceptual and early motor representations (see Fig. 2; Hommel et al., 2001). The dispute as to the interpretation of additivity and interactivity of experimental manipulations in particular tasks has not yet ended, and it is clear that the postulate of serial processing stages may not hold in all situations (McClelland, 1979; Meyer, Yantis, Osman, & Smith, 1985; Sanders, 1990; Stafford & Gurney, 2011). Yet, we argue that a serial processing stage model still remains a useful metaphor for the neural processing mechanisms carried out in most choice reaction tasks, even after extensive practice (Kamienkowski, Pashler, Dehaene, & Sigman, 2011).

Figure 3 presents the Additive Factors Model for choice reaction time tasks with seven processing stages (Sanders, 1990, 1998). It postulates that producing a response movement involves the retrieval from long-term memory of an abstract motor program at the response selection stage. This is followed by motor programming/parameter specification, by loading the resulting motor program into a buffer during program loading/unpacking, and then by making some final motor adjustments (cf. Lien et al., 2005; Rosenbaum, 2013).

Fig. 3

The processing stages assumed by the Additive Factors model and the Dual Processor model, and the processors responsible for them. Center left column: the processing stages of the Additive Factors model along with the variables influencing them (Sanders, 1990, 1998). Left column: the processors assumed by C-SMB to be responsible for these processing stages (see text). Center right column: the motor stages postulated by the Dual Processor Model in the case of movement sequences of up to 3 to 5 elements (Verwey, 1994, 2001; following in part, Sternberg et al., 1978). Right column: type of information processed. The cognitive loop and motor loop arrows correspond with the plan-based and chunking execution modes, respectively

In the case of movement sequences, Sternberg et al. (1978, 1988; see also Verwey, 1994, 2001; Sanders, 1998) proposed with the Subprogram-Retrieval Model that buffer loading is followed by three additional processing stages to produce each individual movement, namely buffer search, unpacking, and execution. During the buffer search stage, the next elements in the motor buffer are successively located in a self-terminating manner. During the unpacking stage, the retrieved movement is readied, and, during the execution stage, the required commands are issued to the motor system. Sternberg et al. (1978) argued that the duration of the buffer search stage is affected by sequence length. The duration of the unpacking stage is probably affected by the nature of the individual movements (cf. Klapp, 1995, 2003).
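Why a self-terminating buffer search would take longer with longer sequences can be shown with a short calculation. The sketch assumes, purely for illustration, that the target element is equally likely to be found at any position in the search order, so that the search stops after 1, 2, ..., n inspections with equal probability. The function name `mean_search_steps` is hypothetical, and no claim is made here about actual search times.

```python
def mean_search_steps(n_elements: int) -> float:
    """Mean number of buffer elements inspected before the target is found.

    Assumes a self-terminating search in which the target is equally likely
    to be at any of the n positions in the search order, so the mean is
    (1 + 2 + ... + n) / n = (n + 1) / 2.
    """
    if n_elements < 1:
        raise ValueError("the buffer must hold at least one element")
    return sum(range(1, n_elements + 1)) / n_elements


# Mean inspections for buffers holding 1 to 5 elements.
steps_by_length = {n: mean_search_steps(n) for n in range(1, 6)}
```

Under this assumption the mean number of inspections grows linearly with the number of buffered elements, which is the kind of sequence-length effect on the buffer search stage described above.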

Importantly, studies with the Psychological Refractory Period paradigm and the Discrete Sequence Production task allow allocating the processing stages of the Additive Factors Model and Subprogram-Retrieval Model to the above proposed processors. The observation that the second of two successive choice RT tasks in the Psychological Refractory Period paradigm is slowed by the first choice RT task is generally explained by a central processor being allocated to the second task only after it has completed the processing stages of the first task (Byrne & Anderson, 2001; Meyer & Kieras, 1997; Pashler, 1984; Pashler & Christian, 1994; Ruthruff & Pashler, 2001; Ruthruff, Pashler, & Klaassen, 2001; Tombu & Jolicœur, 2005). Response selection is the prototypical processing stage that would be responsible for the central processing bottleneck (Pashler, 1994), but the central bottleneck was later found to also affect stimulus identification, mental rotation, and other processing stages involving the retrieval of knowledge from memory (Johnston & McCann, 2006; Meyer & Kieras, 1997; Pashler, 1994; Pashler & Christian, 1994; Ruthruff & Pashler, 2001). In contrast, the perceptual and motor processing stages are not subject to this central bottleneck. In some situations, response initiation was found to impose a second bottleneck (De Jong, 1993). This is in line with the motor processor performing one process after the other.
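The central bottleneck account can be sketched as a small timing calculation: the central stage of the second task cannot start until both its own perceptual stage and the central stage of the first task have finished. The function name `prp_rt2`, the three-stage split, and all stage durations are assumptions made for illustration; they are not estimates from any study.

```python
def prp_rt2(soa: float, perceptual1: float, central1: float,
            perceptual2: float, central2: float, motor2: float) -> float:
    """Reaction time to the second stimulus under a single central bottleneck.

    All durations are in ms. Times are measured from the onset of the first
    stimulus; the second stimulus appears `soa` ms later.
    """
    central1_end = perceptual1 + central1
    # Task 2's central stage waits for its own perceptual stage AND for the
    # central processor to be released by Task 1.
    central2_start = max(soa + perceptual2, central1_end)
    response2_time = central2_start + central2 + motor2
    return response2_time - soa  # RT2 is measured from the second stimulus


# Hypothetical stage durations (ms), identical for both tasks.
stages = dict(perceptual1=100.0, central1=150.0,
              perceptual2=100.0, central2=150.0, motor2=100.0)

rt2_short_soa = prp_rt2(soa=50.0, **stages)    # 450.0: Task 2 waits 100 ms
rt2_medium_soa = prp_rt2(soa=150.0, **stages)  # 350.0: no waiting left
rt2_long_soa = prp_rt2(soa=500.0, **stages)    # 350.0: tasks do not overlap
```

With these assumed durations, RT2 drops by one millisecond for every millisecond of added stimulus onset asynchrony until the central processor is free in time, and is flat thereafter. The perceptual and motor stages never wait in this sketch, in line with the statement that they are not subject to the central bottleneck.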

Various studies on the production of movement sequences confirmed that response selection is carried out by a processor other than the motor processor, because forthcoming movements can be selected while earlier movements are being executed (Garcia-Colera & Semjen, 1988; Klapp & Jagacinski, 2011; Rosenbaum et al., 1987; Sternberg et al., 1988; Sternberg et al., 1978; Verwey, 1995). In the case of discrete keying sequences, the selection of an entire motor chunk can also occur while a preceding sequence is being carried out (Verwey, 2001). These findings provide further support for the independence of a central and a motor processor. That other central processor tasks can be carried out during sequence execution is shown by the recent observation that counting targets, which owing to its dependence on short-term memory is a typical central processor function (Bajic & Rickard, 2011; Logie, Gilhooly, & Wynn, 1994), occurred while a sequence was carried out (Verwey et al., 2014).

The reason to propose that the processing stages preprocessing, feature extraction, and habitual forms of identification are carried out by a perceptual processor (Fig. 3) is that Psychological Refractory Period studies show that the last of these processing stages, stimulus identification, usually precedes the central processing bottleneck (Pashler & Johnston, 1989). Only with more complex forms of identification, like with uncommon, rule-based classifications, stimulus identification also appears subject to the central bottleneck (Johnston & McCann, 2006). Indeed, when in a Discrete Sequence Production task a tone presented during execution of a familiar sequence was to be classified in an arbitrary way as low or high, the results indicated that this uncommon identification process requires central processing (Verwey et al., 2010, 2014). In line with other researchers (Bajic & Rickard, 2011; Pashler, 1994; Ruthruff et al., 2001), we conclude that the central processor is involved in stimulus identification when the classification requires the application of temporary rules in short-term memory (e.g., is 6 < 8?), whereas it is not involved in stimulus identification in the case of consistent classifications (like, is 6 a number?).

We propose that the buffer loading stage in the Dual Processor Model (which is probably equivalent to the motor programming stage in the Additive Factors Model; see Fig. 3) is the last processing stage carried out by the central processor. This is in line with the results of one particular Psychological Refractory Period study that used a first task consisting of a sequence of one to five key presses (Pashler & Christian, 1994). It appeared that initiating the first response of the sequence did not show the typical sequence length effect (Sternberg et al., 1978), but the sequence length effect did emerge in the response time of the second (vocal) task. Still, this second task was initiated before the first task (i.e., the sequence) had been fully completed. These two observations, too, can be explained by the distinction between a central and a motor processor. We argue that the first response of the keying sequence was selected and immediately executed, and that, while the motor processor was executing this response, the remainder of the sequence was loaded by the central processor into the motor buffer (Klapp & Jagacinski, 2011; Portier, van Galen, & Meulenbroek, 1990; Schröter & Leuthold, 2009). This loading took more time with longer sequences, so that the central processor could accommodate the second task only later when the first task involved a longer sequence. Only after all sequence elements had been loaded into the motor buffer (and sequence execution continued), did the central processor serve the second task. This interpretation explains the sequence length effect on the second task. The finding that the remainder of the first task’s sequence was also carried out during execution of the second, vocal task is consistent with the assumption of two separate motor processors.

In short, C-SMB assumes that the processing stages performed by the central processor in choice reaction time and sequencing tasks include memory-demanding stimulus identification, response selection, mental rotation, target counting, and motor buffer loading. The reason to also address perceptual processes here is that some findings with the serial RT task can be explained only by the notion that sequence learning involves priming at the perceptual level (Abrahamse, Jiménez, Verwey, & Clegg, 2010). Below, we come back to this. Finally, the processing stages required to execute the content of the motor buffer are carried out by the motor processor. In the case of movement sequences, these processing stages include Sternberg et al.’s (1978, 1988) buffer search, unpacking, and execution stages.

Assumption 5: Racing processors and dual task interference

The conclusion that the perceptual, central and motor processors may perform different processing operations in parallel does not only explain the many indications that central processes are active while a movement sequence is being carried out (Verwey, 2001). It can also account for indications that a cognitive secondary task (like counting tones) can slow down ongoing sequence execution. Rather than attributing this task interference to loading a graded central processing resource (Kahneman, 1973; Wickens, 1984, 2008), it is explained by the notion that the central processor usually races with the motor processor to produce each next movement, and that the contribution of the central processor to this race is eliminated in the case of a secondary task (Verwey, 2001; Verwey et al., 2014). Indeed, it has been shown that, when the distributions of processing times of each processor overlap, the resulting movement production times are shorter than when only one processor is involved (i.e., statistical facilitation; Raab, 1962; Verwey, 2003b). During this race, the selection of individual responses by the central processor may be based on external guidance by stimuli (Verwey, 2001), or on a representation in short-term memory (e.g., a verbal or spatial sequence description, Ruitenberg et al., 2012; Seidler, Bo, & Anguera, 2012; or a “plan”, Tubau et al., 2007).
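Statistical facilitation can be illustrated with a small simulation. This is a minimal sketch under stated assumptions: both processors are given independent, normally distributed finishing times with the same hypothetical mean and standard deviation, and the faster processor on each trial triggers the movement. The function name and the parameter values are illustrative and not taken from the cited studies.

```python
import random


def simulate_race(n_trials=20000, mean_ms=300.0, sd_ms=40.0, seed=1):
    """Return mean production times (central alone, motor alone, race).

    Assumption: independent normal finishing times with identical,
    fully overlapping distributions for the two processors.
    """
    rng = random.Random(seed)
    central_times, motor_times, race_times = [], [], []
    for _ in range(n_trials):
        central = rng.gauss(mean_ms, sd_ms)
        motor = rng.gauss(mean_ms, sd_ms)
        central_times.append(central)
        motor_times.append(motor)
        race_times.append(min(central, motor))   # the faster processor wins
    average = lambda values: sum(values) / len(values)
    return average(central_times), average(motor_times), average(race_times)


central_mean, motor_mean, race_mean = simulate_race()
print(f"central alone: {central_mean:.1f} ms")
print(f"motor alone:   {motor_mean:.1f} ms")
print(f"race:          {race_mean:.1f} ms")
```

In this sketch the mean of the race is lower than the mean of either processor alone, even though neither processor has become faster. Dropping the central processor from the race, as a secondary task is assumed to do, returns performance to the motor-alone mean, which would show up as slowed sequence execution.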

Assumption 6: Cognitive control

Cognitive control is exerted by the central processor pre-activating the required processes and loading relevant information into short-term memory. If participants prepare to respond to a stimulus (i.e., in the so-called stimulus-based or externally guided mode; Herwig et al., 2007; Tubau et al., 2007) they actually pass on control to the display of a limited set of stimuli. Setting this external control mode involves the advance loading of a stimulus set and of stimulus–response translation rules into short-term memory. Further, the response set is prepared by pre-loading the stimulus-independent movement features into the motor buffer. These steps ready the system to act upon display of a particular stimulus in a reflex-like way (Exner, 1879; Hommel, 2000).

In the internally guided intention- or plan-based mode movements are determined by a performer on the basis of her or his current goals (Herwig et al., 2007; Tubau et al., 2007). These behavioral goals may be derived from a plan that specifies successive movements in terms of their sensory feedback (i.e., the action effect; Elsner & Hommel, 2001; Herwig et al., 2007; Herwig & Waszak, 2012; Janczyk, Heinemann, & Pfister, 2012) (Footnote 5). This type of plan is abstract in that it contains perceptual and/or verbal representations needed to select successive movements (Tubau et al., 2007). With repeated execution of such an abstract plan to produce movement sequences, the plan becomes less important as successive movements become associated into integrated motor representations (such as, e.g., motor chunks).

We assume that a central process, once prepared, is carried out by the central processor when all required input information is available (e.g., is active in short-term memory). Support for this idea comes from a study by Sackur and Dehaene (2009). These researchers had participants carry out two successive mathematical procedures (add two, then compare with five). Reaction time analyses indicated that the second process was sometimes triggered by availability of input to the first process, rather than that it started after the first process had produced output. This indicates that, first, both processes could be simultaneously activated, possibly because they involve different neural structures, and, second, that they were both triggered by the availability of the required information in short-term memory (for similar findings, see Sudevan & Taylor, 1987). Sackur and Dehaene (2009) argued that consciousness is required to make sure that the prepared processes are used in the proper order. That two central processes may be simultaneously active is further suggested by findings with the Psychological Refractory Period task showing crosstalk when the two successive tasks use the same type of input information (Hommel, 1998; Pashler, 1994).

Processors process information independently at each of the three levels. It is clear, though, that these processors must to some extent interact with each other. For instance, it has been shown that early visual processing is influenced by top-down control (Rauss, Schwartz, & Pourtois, 2011), and that motoric chunking patterns are influenced by stimulus features (Boutin, Massen, & Heuer, 2013; Jiménez, Méndez, Pasquali, Abrahamse, & Verwey, 2011). These and other results suggest that interactions between processors occur because the central processor controls perceptual and motor processors by preparing them for a particular task. Furthermore, the representations that develop at the central and motor processing levels are adjusted to processing at other levels. For instance, coding stimuli with colors in a serial RT task stimulated the use of motor chunks, but the subsequent removal of the colors caused participants to go back to a one-by-one response mode (Jiménez et al., 2011).

On the basis of these findings, we argue for C-SMB that (1) the executive task of preparing cognitive processes is carried out by the central processor, that (2) several prepared, central processes may be simultaneously active and wait for input, and that (3) already prepared processes start processing as soon as the required input information is available. So, central and perhaps also motor processes start when all required input information is available in short-term memory and in the motor buffer, respectively. At the level of the perceptual processors, once prepared, processing is probably triggered by sensory information that has passed a preset attentional filter (Baddeley, Allen, & Hitch, 2011).

Conclusions on sequence performance strategies

The C-SMB architecture assumes that motor sequence learning can occur at three processing levels using different types of coding. It stresses flexibility in the way in which movement sequences are represented and executed. We argue that sequence execution may continue to rely on external guidance, in which case responding gets faster because the stimulus and response orders are fixed. Then processes at any level may be primed by processes and feedback that underlie the production of preceding responses. Sequence execution may also involve an internally guided mode (Keele, 1968). In that case, movement order can be represented by a central-symbolic representation in short-term memory (a possibly verbal plan) that is read to deduce each individual movement, and also by a motor representation in the motor buffer (e.g., a motor chunk) that can be directly used by the motor system to produce each movement. When these two types of representations co-exist, the central and motor processors may be racing to trigger each sequence element. With extensive practice, the motor representation is fine-tuned for the dynamics of the effectors so that control eventually becomes effector-dependent and allows co-articulation. As motor coding is efficient and yields fast execution, this type of coding is likely to eventually become dominant and overshadow other representations (unless execution is instructed to be slow, like in some imaging studies, or when executing each sequence element takes considerable time).

The flexibility to account for the different ways of producing movements and movement sequences makes it hard to find empirical support for C-SMB. Indeed, general processing frameworks and architectures like C-SMB are explicitly meant to account for many tasks and processing modes across different task domains (like production rule frameworks, e.g., Anderson, 1990; and the TEC framework, Hommel et al., 2001). This is why we prefer to speak of a framework instead of a model. An explicit prediction of our framework, however, is its assumption that information processing involves autonomous perceptual, central, and motor neural processors. This assumption may receive support from neurocognitive research.

The notion that sequencing skill may involve various types of abstract and concrete motor representations, and the flexibility of skilled performers to strategically change between execution modes, suggests that a sequencing skill can be utilized in various tasks at the cost of only limited performance decrement because there often is a suitable representation. Researchers interpreting performance and neurophysiological results of movement sequence studies should realize that their results may be based on a potentially task-dependent mixture of execution modes, rather than on one particular execution mode.

Existing models of motor behavior

In this section, we consider a diverse set of cognitive accounts of sequential movement production. These include the Dual System model for the Serial RT task (Keele et al., 2003), Sternberg et al.’s (1978) Subprogram-Retrieval model, Verwey’s (2001) Dual Processor Model, Schmidt’s (1975) Schema theory, and Rosenbaum’s Hierarchical Editor (1985), Parameter Remapping (1986), and Goal Posture (1995, 2001) models. We argue that these models all fit the proposed C-SMB framework.

External sequence control: speeding up reactions

We first address the serial RT task (Nissen & Bullemer, 1987; for an early version, see Bahrick, Noble, & Fitts, 1954). In this task, participants typically react to 12 key-specific stimuli that are displayed in a fixed order. Such a reaction series is repeated without interruption (for reviews, see Abrahamse et al., 2010; Keele, Ivry, Mayr, Hazeltine, & Heuer, 2003). Initially, the participants are said to produce the sequence in the reaction mode (Verwey, 2003b; Verwey & Abrahamse, 2012). In the course of practice, participants gradually respond more rapidly to each stimulus. Still, they often claim not to be aware of a stimulus or response order. The increased response rate found with these unaware participants is attributed to the development of associations between representations involved in responding. These allow the sequence to be carried out in the so-called associative mode (Verwey & Abrahamse, 2012; Verwey & Wright, 2014). In line with the C-SMB assumption of processors operating at three levels, associative learning in the serial RT task has been argued to develop at the perceptual, central and motor processing levels (Abrahamse et al., 2010). Associations at each of these levels make up a processing level-specific representation that primes processes and representations that are required for producing the next responses in the sequence. This priming may concern stimulus features, verbal, egocentric and allocentric spatial, and motor representations (Abrahamse et al., 2010; Goschke & Bolte, 2012).

According to the Dual-System Model of learning in the serial RT task (Keele et al., 2003), unidimensional sequence learning at the perceptual level involves priming at the level of individual stimulus features, like colors, locations, shapes, and tone pitches by the earlier processing of each of these features. So, this is a clear form of stimulus–stimulus learning (Abrahamse et al., 2010). A somewhat different type of sequence learning at the perceptual level is based on associations between the perceived feedback of a movement and the ensuing stimulus. This is referred to as response–stimulus learning (Ziessler & Nattkemper, 2001). This type of sequence learning is actually also based on stimulus–stimulus learning because of its reliance on associations between feedback and imperative stimuli.

At the central processing level, implicit associative learning in the serial RT task is based on associations between similar types of representations that are used for successive responses (Abrahamse et al., 2010; Koch & Hoffmann, 2000; Willingham et al., 2000; Witt et al., 2008). In the serial RT task, performance is assumed to involve successive response locations in an egocentric, effector-independent code (A. Cohen, Ivry, & Keele, 1990; Grafton, Hazeltine, & Ivry, 1998; Keele, Jennings, Jones, Caulton, & Cohen, 1995; Willingham et al., 2000). Furthermore, sequence learning at the central level is attributed to associations between successive stimulus–response compound representations, like stimulus–response mappings (Deroost & Soetens, 2006; Schwarb & Schumacher, 2010, 2012).

Participants who cannot verbalize the response order are said to have implicit sequence knowledge and to lack explicit sequence knowledge. We argue that the capacity to write down a sequence does not necessarily imply the availability of a full-fledged verbal representation. Recent research with the Discrete Sequence Production task showed that so-called aware participants often claim to have constructed explicit sequence knowledge using implicit sequence knowledge (Verwey & Abrahamse, 2012; Verwey et al., 2010; Verwey & Wright, 2014). When filling out an awareness questionnaire, they consciously develop and test hypotheses by (physically or mentally) replaying the movement pattern (Rünger & Frensch, 2008). Verbal reproduction (‘awareness’) may, thus, be associated more with the ability to translate implicit sequence knowledge into other representations than with having an actual verbal representation (Postle, 2006; Schwager, Rünger, Gaschler, & Frensch, 2012). The idea that people are hardly aware of their actions fits claims from other research domains (for a review, see Jeannerod, 1999). For sequence execution, it is probably more important to know that one has implicit sequence knowledge than that one can verbalize all elements of that sequence (Dienes & Scott, 2005).

The usefulness of explicit sequence knowledge lies especially in the possibility to transfer sequence knowledge to situations in which motor representations themselves are not useful, like when another effector is being used or the spatial layout is changed (see studies such as Grafton et al., 1998; Keele et al., 1995; Willingham, Nissen, & Bullemer, 1989). Various serial RT task studies observed that participants with explicit knowledge execute the practiced sequence more rapidly (Tubau et al., 2007), but this is not always observed. The beneficial effect of awareness may well be limited to sequences that are executed only moderately fast (Verwey & Wright, 2014). With high execution rates, it probably takes too much time to develop, and later apply, explicit sequence knowledge of individual sequence elements (Cleeremans & Sarrazin, 2007; Rünger & Frensch, 2008).

Sequence learning in the serial RT task has been shown to also involve a slowly developing response–response type of learning. This learning may or may not be effector-specific (Abrahamse et al., 2010). Effector-specific sequence learning is indicated by a performance reduction when an effector (finger or hand) is used that differs from the one used during practice. Such a reduction has been observed in many different motor tasks (Doya, 2000; MacNeilage, 1970; Mattys, 2004; Park & Shea, 2003; Shea & Wulf, 2005; Sosnik, Hauptmann, Karni, & Flash, 2004), including the serial RT task (Deroost, Zeeuws, & Soetens, 2005; Keele et al., 1995; Verwey & Clegg, 2005). In the serial RT task, effector-specific learning has even been demonstrated by the capacity to learn a sequence with the fingers of one hand while the other hand performs a random sequence (Berner & Hoffmann, 2009).

Co-articulation involves the smoothed execution of a movement sequence because one movement is initiated before the previous one has been completed. We know of no keying studies demonstrating co-articulation, but it is a common finding in everyday typing and piano playing that one finger often starts moving towards a particular key before the preceding key has been depressed by another finger (Engel, Flanders, & Soechting, 1997; Fowler, 1981; Jordan, 1995; Kent & Minifie, 1977; MacNeilage, 1970; Mattys, 2004; Shaffer, 1975). We attribute effector-specific sequence learning and co-articulation to the motor processor. The reason is that both phenomena involve efficient use of the biomechanical properties of the limbs used (Park & Shea, 2003; Verwey & Clegg, 2005; Verwey & Wright, 2004), and that this type of motor learning is characterized by a much slower development than is typical for sequence learning in the serial RT task (cf. Hikosaka et al., 1999).

Hence, in the case of successively selected movements—as in the serial RT task—sequential movement skill may be based on priming at each of the three processing levels that C-SMB distinguishes. We refer to this type of learning as associative sequence learning (Verwey & Abrahamse, 2012; Verwey & Wright, 2014). It is characterized by skilled sequence execution that still relies heavily on guidance by external stimuli (J. Cohen & Poldrack, 2008; Jiménez, 2008), like when skillfully playing a familiar piano piece from sheet music.

Internal sequence control: various processing strategies

In the case of movement sequences performed in response to external stimuli, participants may improve performance by reducing and even eliminating their reliance on guidance by movement-specific stimuli (Goldberg, 1985; Hikosaka et al., 1999; Tubau et al., 2007). The required movement representations may develop at various levels of processing, and these representations differ to the extent that movement details are pre-specified. Consequently, a particular movement series can be produced using various alternative internal models (Hirashima & Nozaki, 2012). This flexibility to change to another processing strategy—i.e., another execution mode—was recognized long ago (e.g., Wickelgren, 1969). It is assumed to underlie both skill and behavioral flexibility in sequencing tasks (MacKay, 1982), and constitutes one of the core features of C-SMB. It is this flexibility that allows our framework to accommodate various existing models of movement sequence production. We here discuss various discrete sequencing research paradigms and how they fit C-SMB.

Discrete keying sequences

Many discrete sequence studies use key pressing sequences because, apart from the simple implementation of these tasks, the simple nature of key presses makes the observed response times a good indicator of the underlying control processing (Rhodes et al., 2004). In the Discrete Sequence Production task (described in the earlier Dual Processor Model section), the assumption, obviously, is that the sequence control mechanisms that have been unveiled by studies of keying sequences are also responsible for the production of sequential real world skills. We here argue that executing discrete keying sequences is based on central-symbolic representations in short-term memory from which successive movements are deduced during sequence execution (Ruitenberg et al., 2012a; often called a plan, e.g., Tubau et al., 2007), and/or on representations in the motor buffer that can be immediately executed by the motor processor (Footnote 6).

Studies of keying sequences indicate that central-symbolic sequence representations in short-term memory (‘plans’) can take various forms. They may be constructed on the basis of a short series of spatial stimuli that are first displayed in a go/no-go task (De Kleine & Van der Lubbe, 2011; Ruitenberg et al., 2012a). If the sequence involves a particular regularity, performers may develop (verbal) rules in short-term memory to represent the sequence (Jones, 1981; Postle, 2006; Povel & Collard, 1982; Restle, 1970; Rosenbaum et al., 1983; Simon, 1972). For instance, pressing the sequence 12121212 is probably learned in terms of a rule saying that 12 is to be repeated four times. Typing the sequence 12344321 may be represented by a verbal representation to go from 1 to 4 and then back, or a spatial representation indicating the start, reversal and end positions of the keys. The rules in short-term memory may involve operations like Transpose, Repeat, and Mirror (Restle, 1976). If there is no apparent regularity in the sequence, as with most phone numbers, participants can learn the order of stimuli in a verbal or spatial code (Footnote 7). Research on the reproduction of relatively unfamiliar series of elements (e.g., digits or letters) indicates that sequence knowledge tends to be reproduced in terms of successive three- and four-element groups (e.g., Wickelgren, 1967; for a review, see Fendrich & Arengo, 2004). All these sequence control strategies have in common that individual movements are derived by the central processor from parsing a short-term memory representation in a cognitive loop during which each next element is determined by the central processor and then passed on to the motor processor (Figs. 2 and 3). Transforming a representation in short-term memory into successive movements induces substantial central processor load.
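How a compact rule can stand in for a full key sequence may be sketched as follows. The operation names follow the text, but their definitions here are simplifying assumptions made for illustration and are not Restle's formal definitions; in particular, `mirror` is implemented as appending the same keys in reverse order, chosen only because it reproduces the 12344321 example. Keys are represented as integers.

```python
def repeat(pattern, times):
    """Repeat a key pattern a given number of times."""
    return list(pattern) * times


def mirror(pattern):
    """Go through the pattern and then back (illustrative assumption)."""
    return list(pattern) + list(reversed(pattern))


def transpose(pattern, step):
    """Shift every key in the pattern by a fixed number of positions."""
    return [key + step for key in pattern]


def as_digits(keys):
    """Render a key sequence as a digit string for display."""
    return "".join(str(key) for key in keys)


print(as_digits(repeat([1, 2], 4)))        # rule: "12, four times"
print(as_digits(mirror([1, 2, 3, 4])))     # rule: "from 1 to 4 and back"
print(as_digits([1, 2] + transpose([1, 2], 2)))
```

The point of the sketch is that the rule is much shorter than the sequence it generates, but each key still has to be derived from the rule at execution time. This derivation step corresponds to the central processor load that the text attributes to parsing a short-term memory representation.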

Instead of immediately executing each movement using a representation in short-term memory, representations of successive movements may be first collected in the motor buffer. Only after buffer loading has finished is the entire movement sequence executed. This is probably what happens in a sequence go/no-go task where a stimulus series is displayed before the indicated keying sequence is produced (De Kleine & Van der Lubbe, 2011; Ruitenberg et al., 2012a), and when short unfamiliar sequences are known (i.e., stored verbally) and programmed beforehand (De Kleine & Verwey, 2009; Sternberg et al., 1978; Verwey, 1996; Verwey, Lammens, & van Honk, 2002). This execution mode distinguishes itself from reading central-symbolic representations by high execution rates and limited dual task interference during execution.

If a movement sequence is repeatedly carried out, a motor representation or chunk develops in long-term memory that can later be loaded by the central processor into the motor buffer in a single processing step (Verwey, 2001). Support for this motor chunk notion has been reported in a variety of serial movement tasks, like the serial flexion-extension task, typing, Morse code production, uttering nonsense words, writing, and moving a pen through a maze (Rhodes et al., 2004) (Footnote 8). Like the memory chunks proposed for cognitive tasks (Halford, Wilson, & Phillips, 1998; Miller, 1956), the benefit of motor chunks is that they eliminate the need to construct a representation anew each time the sequence is executed. The development of motor chunks has been argued to be responsible for the reduction with practice of the effect of sequence length on sequence initiation time (Immink & Wright, 2001; Klapp, 1995; Verwey, 1999; Wright, Black, Immink, Brueckner, & Magnuson, 2004) (Footnote 9). Findings of a stimulus-sequence reversal effect (Verwey, 2001), and the possibility of selecting a sequence using the anticipated effect of the sequence as a whole (Keller & Koch, 2006, 2008), support the notion that a movement sequence is coded in a single representation that can be selected as if it were the representation of a single movement (Verwey, 1999).

Irrespective of whether the motor buffer is loaded with a motor representation in a single step (i.e. activating a robust motor chunk in long-term memory) or in a series of successive steps, its content serves the rapid execution in the so-called chunking mode (Verwey, 1996; Verwey & Abrahamse, 2012). This mode involves the successive search and retrieval of information for each next movement by the motor processor in the motor loop (Fig. 3).

In contrast to sequence execution on the basis of a short-term memory representation, execution on the basis of the motor buffer content does not load the central processor. However, the motor buffer has a capacity limited to only 3–5 movements (Bo & Seidler, 2009; Verwey & Eikelboom, 2003; Verwey et al., 2002). With longer sequences, execution requires concatenating successive subsequences (Acuna et al., 2014; Wymbs et al., 2012). In the case where these subsequences are executed in an unfamiliar order, selecting and initiating each next motor chunk is still a central processor task. However, when subsequences are executed in a familiar order, selecting each next motor chunk is automatic and the central processor is no longer needed (Verwey et al., 2010, 2014). In either case, the transition from one subsequence to the next is indicated in keying sequences by a relatively slow movement (Bo & Seidler, 2009; Kennerley et al., 2004; Verwey et al., 2009, 2010, 2014), though with extensive practice this slowing might diminish (Acuna et al., 2014). Notice, however, that indications for the use of subsequences may also be concealed by individual differences. Participants may use subsequences of different lengths in a particular sequence so that there is no clear concatenation point when analyzed at the group level (Acuna et al., 2014; Verwey, 2003a; Verwey & Eikelboom, 2003; Wymbs et al., 2012). Moreover, participants treated as a homogeneous group may perform in different execution modes. For instance, children and older participants did not always seem to be using subsequences, and this introduced large individual differences in these groups (Ruitenberg, Abrahamse, & Verwey, 2013; Verwey, 2010; Verwey, Abrahamse, Ruitenberg, Jiménez, & De Kleine, 2011). Also, young adults appeared to be able to strategically switch to another execution mode (Jueptner et al., 1997), and it is not clear whether they perhaps do this during an experiment.
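The consequence of a limited motor buffer for longer sequences can be sketched as a simple segmentation. The sketch assumes that a sequence is cut into successive subsequences that each fit a fixed buffer capacity; the function names, the default capacity of 4, and the example key labels are hypothetical, and real performers may segment the same sequence differently.

```python
def split_into_subsequences(sequence, capacity=4):
    """Cut a key sequence into successive subsequences that fit the buffer."""
    if capacity < 1:
        raise ValueError("capacity must be at least 1")
    return [sequence[i:i + capacity] for i in range(0, len(sequence), capacity)]


def concatenation_points(subsequences):
    """Return 1-based positions of the key presses that start a new subsequence.

    These are the positions where a relatively slow key press would be
    expected; the first key press of the sequence is not included.
    """
    points = []
    position = 1
    for chunk in subsequences[:-1]:
        position += len(chunk)
        points.append(position)
    return points


keys = ["d", "f", "g", "j", "k", "l", "f"]      # a hypothetical 7-key sequence
chunks = split_into_subsequences(keys, capacity=4)
print(chunks)
print(concatenation_points(chunks))
```

Running the same sequence with a different assumed capacity moves the concatenation point, which illustrates why averaging over participants who segment differently can blur the slow transition at the group level.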

The flexibility of C-SMB to account for different strategies implies that this framework allows for several alternative accounts in a particular task. For example, inter-movement times in some keying sequences may be accounted for by the central processor traversing a hierarchical representation in short-term memory to determine each ensuing key press (Rosenbaum et al., 1983). The results may, however, also be accounted for by the central processor successively selecting motor chunks using a central-symbolic plan, each of which is then executed by the motor processor. Also, during sequence execution the central processor may successively construct plans for oncoming movements in short-term memory, each of which yields a temporary motor chunk in the motor buffer.

In conclusion, studies with discrete keying sequences have been explained by the Dual Processor Model (Verwey, 2001). Like the Dual Processor Model, C-SMB can account for discrete keying sequences. However, C-SMB explicitly emphasizes the flexibility to use different strategies to execute discrete movement sequences, and to use different types of movement representations—either simultaneously or alternatingly. Understanding the responsible execution strategy in a particular sequencing task becomes especially difficult with relatively long and unfamiliar sequences, as these may involve substantial individual differences in strategy and representations used.

Sequences of aiming movements

A classic model of how movements are planned and executed is Schema theory. Schema theory assumes two representations, an abstract General Motor Program and a Recall Schema (Schmidt, 1975; Schmidt & Lee, 1999; for a review, see Shea & Wulf, 2005). The General Motor Program defines a class of movements with invariant features such as the sequencing of submovements, relative timing, and relative forces. The Recall Schema contains schemata to determine movement parameters like speed, size, and muscle group/effector to scale the General Motor Program into an executable motor program which is suited for a particular situation. The General Motor Program is formed first, and only then can a stable Recall Schema develop (Shea & Wulf, 2005). Even though Schema theory was originally formulated to support fast open loop movements (i.e., feedback is not used to adjust ongoing movements), Shea and Wulf (2005) argued that feedback plays a superficial role in influencing the production of slower movements, and that therefore motor programs constitute a good description for the planning and execution of slower movement sequences too.

While the order in which movement parameters are specified was originally thought to be arbitrary (Rosenbaum, 1985), EEG research suggests that absolute timing is specified before muscle group/effector (Leuthold & Jentzsch, 2011). Force would be specified last, after the other parameters (Shea & Wulf, 2005). Schema theory suggests that the force and timing parameters hold for the movement sequence as a whole. However, the parameter remapping phenomenon (Rosenbaum, Weber, Hazelett, & Hindorff, 1986) indicates that these parameters may be specified separately for each individual movement, and that this temporary binding of parameters to individual movements remains active for some time after movement completion. Also, movement parameters may be derived directly from the perceived stimuli (direct parameter specification; Neumann, 1990). This occurs especially when there is feature overlap between stimuli and response movements, such as when an effector is spatially cued to move to a particular location (Adam & Pratt, 2004; Diedrichsen et al., 2001). In Fig. 2, the possibility for the direct control of movement by stimulus features is reflected in the overlap between short-term memory and the motor buffer.

A major distinction between the concepts of a motor program (Shea & Wulf, 2005) and of motor chunks (Verwey, 1999) is that motor program construction involves two processing stages: program loading and parameter specification. In contrast, loading a motor chunk into the motor buffer would involve a single processing stage, and specification of movement parameters like force seems unnecessary because these parameters have either been specified during preparation or have already been integrated in the motor chunk representation. Consistent with this notion of in-built movement parameters, indications have been reported in the motor programming literature that, after extensive practice and when stimuli are mapped directly onto responses, movement parameters can be associated with the General Motor Program so that these need not be specified separately (Goodman & Kelso, 1980; Keetch et al., 2005).

In terms of C-SMB, we argue that during the response selection stage a representation of the intended effect of the action is loaded by the central processor into short-term memory (Hommel et al., 2001). During the programming stage, this information allows a General Motor Program representation to be loaded in short-term memory as well (Schmidt, 1975). This allows the central processor to specify movement order, relative timing, and relative force, which are then loaded into the motor buffer. During the parameter specification stage, the central processor uses the Recall Schema to further determine, for the current situation, the movement parameters absolute timing, muscle group/effector, and overall force (Schmidt, 1975). These, too, are then loaded into the motor buffer. After this parameter specification stage has been completed, the motor buffer contains all the information needed to execute the movement sequence. This triggers the motor processor to start movement execution.
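
The two-stage loading of the motor buffer described here can be sketched in a few lines. This is only an illustration under simplifying assumptions: relative timing and relative force are treated as proportions that are scaled multiplicatively by the absolute parameters, and all names (`GeneralMotorProgram`, `Parameters`, `load_motor_buffer`) and values are hypothetical rather than taken from Schema theory or C-SMB.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass(frozen=True)
class GeneralMotorProgram:
    """Invariant features: movement order, relative timing, relative force."""
    order: Tuple[str, ...]
    relative_timing: Tuple[float, ...]  # assumed proportions of total duration
    relative_force: Tuple[float, ...]   # assumed proportions of overall force


@dataclass(frozen=True)
class Parameters:
    """Situation-specific values, assumed here to be supplied by the Recall Schema."""
    total_duration_ms: float
    effector: str
    overall_force: float


def load_motor_buffer(
    gmp: GeneralMotorProgram, params: Parameters
) -> List[Dict[str, Union[str, float]]]:
    """Combine the programming stage (gmp) and the parameter stage (params).

    Assumption: absolute values are obtained by simple multiplicative scaling.
    """
    return [
        {
            "movement": movement,
            "effector": params.effector,
            "duration_ms": timing * params.total_duration_ms,
            "force": force * params.overall_force,
        }
        for movement, timing, force in zip(
            gmp.order, gmp.relative_timing, gmp.relative_force
        )
    ]


gmp = GeneralMotorProgram(
    order=("reach", "grasp", "lift"),
    relative_timing=(0.5, 0.25, 0.25),
    relative_force=(1.0, 0.5, 0.5),
)
buffer = load_motor_buffer(gmp, Parameters(800.0, "right hand", 2.0))
for entry in buffer:
    print(entry)
```

Calling `load_motor_buffer` with the same `gmp` but different `Parameters` yields a faster, slower, or more forceful version of the same movement pattern, which is the sense in which the General Motor Program defines a class of movements.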

Posture-based motion planning

In order to develop a unified theory of the planning and control of action, Rosenbaum, Loukopoulos, Meulenbroek, Vaughan, and Engelbrecht (1995) developed the posture-based motion planning model (see also Rosenbaum et al., 2009; Rosenbaum, Meulenbroek, Vaughan, & Jansen, 2001). This model was meant to show how the information-processing system solves the inverse kinematics problem, that is, the problem that each target end posture of the body, defined in terms of a set of joint angles, may be reached using an infinite set of solutions. The posture-based motion planning model postulates that, given a particular behavioral goal, (1) remembered goal postures are evaluated for their suitability for the task at hand according to a requirement hierarchy, and (2) the most suitable goal posture is then adjusted for the situation at hand. This evaluation and adjustment involves internal simulation of the required movements to the goal posture. Once a goal posture has been selected, the required movements are (3) specified and (4) executed to attain the selected body posture. This model assumes that movements are stored in terms of goal postures, and that the motor system autonomously finds some way to reach that goal posture.

The posture-based motion planning model fits C-SMB in that selecting a suitable goal posture is done by the central processor during the response selection stage. That processing stage is then followed by loading representations of the associated muscle innervations in the motor buffer during the motor programming and parameter specification stages. Internal simulation probably leads to estimating the action effects on the basis of the current motor buffer content by way of associations that have previously developed during similar experiences. On the basis of these expected action effects, the central processor adjusts the goal posture and required movements while cycling through the selection and programming stages in the cognitive loop (Fig. 3). Eventually, the motor buffer is judged to contain the optimal movement specification and execution by the motor processor commences. So, this model implies that evaluation and fine tuning of a movement sequence occurs in terms of body postures. Interestingly, the posture-based motion planning model explicitly assumes that selecting and planning of goal postures may occur while ongoing movements are being carried out. This is consistent with C-SMB’s two-processor assumption if we assume that body postures are expressed in motor coordinates in the motor buffer.

Coding movement sequences

The multitude of processing strategies accounted for by C-SMB is consistent with the many indications that movement representations may be coded in very different ways, such as verbal, egocentric spatial, allocentric spatial, goal postures of effectors, and joint angles (Andresen & Marsolek, 2012; Bapi, Doya, & Harner, 2000; Berniker, Franklin, Flanagan, Wolpert, & Kording, 2013; De Kleine & Verwey, 2009; Hikosaka et al., 1999; Panzer, Gruetzmacher, Fries, Krueger, & Shea, 2011; Shea et al., 2011; Verwey & Abrahamse, 2012; Verwey et al., 2010). Indeed, recent sequencing studies with the flexion-extension task (Kovacs, Muhlbauer, & Shea, 2009; Panzer et al., 2011), the serial RT task (Goschke & Bolte, 2012; Tubau et al., 2007), and with the discrete sequence production task (Verwey & Abrahamse, 2012; Verwey & Wright, 2014), all indicate that with practice several of these movement sequence representations develop concurrently (Berniker et al., 2013).

Research with discrete sequence production tasks further indicates that the execution of familiar movement sequences involves contributions of central-symbolic representations in short-term memory and motor representations in the motor buffer in that two processors may race to trigger each response (Ruitenberg et al., 2012a; Verwey, 2001; Verwey et al., 2010). The relative contribution of each representation to sequence execution probably depends on individual differences, amount of practice (e.g., motor coding develops relatively slowly; Kovacs et al., 2009b), type of deviation from the original task, and the number of sequence elements (longer sequences rely more on visual–spatial than on motor coding; Kovacs, Han, & Shea, 2009). In terms of C-SMB, these findings indicate that, as with discrete keying sequences (Verwey, 2003b; Verwey et al., 2014), the cognitive and motor processors may race to trigger the individual sequence elements in other tasks. It is not clear, though, whether the central and the motor processors themselves can also make use of several representations simultaneously, or whether each of them can use just one representation at a time.
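
The notion of two processors racing to trigger each response can be made concrete with a small simulation. The sketch below assumes, for illustration only, that the finishing times of the central and the motor processor are independent and normally distributed; the means, the standard deviation, and the function names are arbitrary placeholders, not estimates from any of the cited studies.

```python
import random
from statistics import mean
from typing import List, Tuple


def race_trial(
    rng: random.Random,
    central_mean_ms: float = 300.0,  # hypothetical value
    motor_mean_ms: float = 280.0,    # hypothetical value
    sd_ms: float = 40.0,             # hypothetical value
) -> Tuple[float, str]:
    """Simulate one response: the processor that finishes first triggers it."""
    central = max(0.0, rng.gauss(central_mean_ms, sd_ms))
    motor = max(0.0, rng.gauss(motor_mean_ms, sd_ms))
    winner = "central" if central < motor else "motor"
    return min(central, motor), winner


def simulate(n_trials: int = 10_000, seed: int = 1) -> List[Tuple[float, str]]:
    """Run many independent race trials with a seeded random generator."""
    rng = random.Random(seed)
    return [race_trial(rng) for _ in range(n_trials)]


results = simulate()
race_mean = mean(time for time, _ in results)
motor_share = sum(winner == "motor" for _, winner in results) / len(results)
print(f"mean response time with race: {race_mean:.1f} ms")
print(f"proportion of responses triggered by the motor processor: {motor_share:.2f}")
```

Under these assumptions the mean response time of the race is shorter than the mean finishing time of either processor alone, and the share of responses won by each processor shifts as the assumed means change, for example with practice.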

A classification of processing strategies

In this section, we briefly discuss the processing strategies (i.e., sequence execution modes) that are accounted for by C-SMB when producing a particular movement sequence. Figure 4 shows an overview.

Fig. 4. A classification of strategies to execute movement sequences. It is based on the C-SMB assumption that a movement sequence can be controlled by the central processor selecting individual movements that are immediately executed by the motor processor, or by the central processor loading up to 4 or 5 movements into the motor buffer before they are executed by the motor processor. Indices coincide with those in the text.

A first class of execution modes involves the motor processor immediately executing each movement that is stored by the central processor into the motor buffer (A in Fig. 4). This strategy is characterized by a reduced sequence execution rate when another task requires the central processor (i.e., task interference). It may involve the central processor selecting individual movements on the basis of a series of external stimuli, as in the serial RT task (AA). In the case of unfamiliar sequences, when there are no associations between successive movements, sequence execution is purely stimulus driven (i.e., the reaction mode; Verwey & Abrahamse, 2012) (AAA). With practice, associations develop at perceptual, cognitive and/or motor levels so that successive movements are facilitated when they are produced in a familiar order (AAB; i.e., the associative mode). Motor chunks do not play a role in this execution mode (Jiménez et al., 2011).

One-by-one movement execution may also be used when the central processor cycles through a few central processes in the cognitive loop (Figs. 2 and 3) to determine each next movement (AB). This can be done on the basis of some central-symbolic representation in short-term memory, like when a verbal description of successive movements is interpreted. Depending on the central-symbolic representation, inter-movement times may be observed that are accounted for by the notion that a processor traverses a hierarchical tree representation (Povel & Collard, 1982; Rosenbaum et al., 1983). The notion that sequencing skill can be based on associations at both the cognitive and motor processing levels suggests that, in that situation, a further distinction can be made between the use of some central-symbolic representation while there is no association between successive movements at the motor level (ABA), and the use of such a rule while associations between motor level representations allow the ensuing movements to be primed, such as when a serial RT sequence is carried out in associative mode in the presence of explicit sequence knowledge (ABB). In the latter situation, different representations are likely to contribute to sequence execution and indications for different strategies may be observed.

Alternatively, when a task involves the execution of bursts of short movement series, the central processor first loads up to about four or five movement representations into the motor buffer (B). The motor processor commences execution when all movements have been fully specified by the central processor. In this pure motor processor-based execution situation, cognitive interference with other tasks is limited to the preparation of the short movement sequence and does not occur during execution (Verwey et al., 2010). In the case of unfamiliar movement series, the sequence representation may first be constructed by the central processor during the cognitive loop (Figs. 2 and 3) by successively loading features of the movement sequence into the motor buffer (BA). This buffer loading process can be based on the advance display of a series of stimuli in a go/no-go task (BAA; e.g., De Kleine & Van der Lubbe, 2011; Ruitenberg et al., 2012a; Sternberg et al., 1978). Buffer loading may also be based on the central processor scanning and interpreting an abstract representation in short-term memory to determine which movements are to be loaded into the motor buffer (BAB; Ruitenberg et al., 2012a). That is, the motor plan is constructed in the motor buffer while cycling through the cognitive loop, and execution involves reading and executing the individual responses from the motor buffer and the motor loop. In both these cases, execution starts only after the motor buffer has been fully loaded.

With familiar movement series of a limited length, the central processor retrieves an existing sequential movement representation from long-term memory, and loads it into the motor buffer in a single processing step (BB). If all movement features have already been integrated into this long-term memory representation, the motor processor can immediately start executing the motor buffer content (BBA). This happens, for instance, in the Discrete Sequence Production task when motor chunks are loaded into the motor buffer (Abrahamse et al., 2013; Verwey, 2001), and also when parameters of individual movements have been integrated into a Generalized Motor Program (Keetch et al., 2005). If movement parameters have not been integrated with the movement representation due to limited practice or variability during practice, the central processor specifies these parameters by loading them during a separate (parameter specification) stage into the motor buffer (BBB). Then, these parameters may be specified using either a Recall Schema developed during earlier experiences (BBBA) (Rosenbaum et al., 1986; Schmidt, 1975), or by allowing stimulus features to directly specify the lacking information without further cognitive processing (BBBB) (Adam & Pratt, 2004; Diedrichsen et al., 2001; Neumann, 1990).
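
The index codes used in this classification form a tree in which every code extends the code of its parent class by one letter. A minimal sketch of that structure is given below; the short labels paraphrase the text, and the dictionary and function names are hypothetical conveniences rather than part of C-SMB.

```python
from typing import Dict, List

# Codes follow the indices used in the text; labels are short paraphrases.
EXECUTION_MODES: Dict[str, str] = {
    "A": "each movement is executed as soon as it is loaded",
    "AA": "movements selected from successive external stimuli",
    "AAA": "reaction mode: purely stimulus driven",
    "AAB": "associative mode: familiar order facilitates successive movements",
    "AB": "each movement determined by cycling through the cognitive loop",
    "ABA": "central-symbolic representation without motor-level associations",
    "ABB": "central-symbolic representation with motor-level associations",
    "B": "short series loaded into the motor buffer before execution",
    "BA": "unfamiliar series: representation constructed in the cognitive loop",
    "BAA": "buffer loaded from stimuli displayed in advance",
    "BAB": "buffer loaded from an abstract short-term memory representation",
    "BB": "familiar series: representation retrieved from long-term memory",
    "BBA": "parameters already integrated: execution starts immediately",
    "BBB": "parameters loaded in a separate specification stage",
    "BBBA": "parameters specified using a Recall Schema",
    "BBBB": "parameters specified directly by stimulus features",
}


def lineage(code: str) -> List[str]:
    """Return the path from the top-level class down to `code`."""
    if code not in EXECUTION_MODES:
        raise KeyError(f"unknown execution mode: {code}")
    return [code[:i] for i in range(1, len(code) + 1)]


def leaf_modes() -> List[str]:
    """Return the codes that have no subclasses, i.e. the most specific modes."""
    return sorted(
        code
        for code in EXECUTION_MODES
        if not any(other != code and other.startswith(code) for other in EXECUTION_MODES)
    )


for step in lineage("BBBA"):
    print(step, "-", EXECUTION_MODES[step])
print(leaf_modes())
```

Reading a code from left to right thus retraces the successive distinctions made in the text, from the choice between one-by-one and buffer-based execution down to the way movement parameters are specified.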

In all the situations in which sequential behavior involves the rapid execution of motor buffer-based movement series, executing longer movement sequences involves the successive execution of relatively short motor sequences. Control may then involve the central processor selecting forthcoming sequence representations while the motor processor is engaged in executing the motor buffer content (i.e., concurrent preparation; Verwey, 2001). One needs to remember, though, that in that situation different participants may use segments of different lengths so that, across participants, concatenation points remain hidden (Verwey, 2003a; Verwey & Eikelboom, 2003). Ways to assess individual differences in longer sequences have recently been proposed by Acuna et al. (2014) and Wymbs et al. (2012). The present classification carries a methodological caution: researchers should realize that the instruction to reduce execution rate (as used in some imaging studies) introduces a mixture of processing strategies in highly trained participants, and may thus activate other neural regions.

C-SMB in perspective

A cognitive model is a description of how the neural information processing system behaves in a particular situation. This explains, on the one hand, that assumptions of cognitive models may hold in some, and not in other task domains. On the other hand, commonalities across various cognitive models may point to additional constraints of the neural system that can be used for developing neural processing models. A case in point is the notion that information is processed in successive steps. This seems a property of the neural information processing system that holds in many tasks. Inspired by a few well-known cognitive models of reaction time studies, we proposed a cognitive processing architecture, C-SMB. This architecture is aimed at explaining how the massively parallel neural information-processing system is able to process information in successive steps, and we worked this model out for the case of serial motor behavior.

C-SMB assumes that the neural system generally behaves as if it involves relatively autonomous modality-specific input and output processors that are connected by a pool of central processing resources. These central processing resources usually function like a single, versatile processor that performs many different cognitive processes in task-dependent processing stages. In the case of reaction time tasks, this central processor processes a stimulus representation that is provided by a perceptual processor, and transmits the outcome to a motor processor. The exchange of information between these processors, via C-SMB’s short-term memory and motor buffer, may be implemented as neural representations that are accessible to all processors in a joint workspace (Baars, 1988; Baars, Franklin, & Ramsoy, 2013).

Sometimes, the central processing resources may also behave like two parallel cognitive processors. We speculate that this is especially likely in tasks in which output of a perceptual processor can be directly used by an already set motor processor, as with direct parameter specification when, for example, a location feature of a stimulus is directly used as parameter for an otherwise already prepared movement (Adam & Pratt, 2004; Diedrichsen et al., 2001; Neumann, 1990). In C-SMB, this direct perceptual–motor link is indicated as the overlap between short-term memory and the motor buffer (Fig. 2) that allows the central processor to be bypassed (after it has first set the system to allow this to happen).

Neural underpinnings

A neural interpretation of the C-SMB cognitive architecture is that the perceptual and motor processors consist of networks of cortical and subcortical regions that are functionally quite separate from the networks that make up the central processor. So, indications for the perceptual and motor processing stages are attributed to relatively autonomous and encapsulated neural processing systems (Sternberg, 1998, 2001). The primary perceptual and motor cortical areas might form the interface between modality-specific cortico-subcortical networks, each of which behaves as a perceptual or motor processor, and the widely distributed set of brain regions that together make up C-SMB’s central processor (Donner & Siegel, 2011). The neural autonomy of the perceptual and motor systems is corroborated by indications that they use system-specific coding such as labeled lines and different neural codes and frequencies to communicate (Boraud, Brown, Goldberg, Graybiel, & Magill, 2005; Castelo-Branco, Neuenschwander, & Singer, 1998; van Wijk, Beek, & Daffertshofer, 2012). Also, these peripheral systems have only a limited number of connections to the central neural systems (possibly a consequence of the so-called neural small-world topologies; Ferrarini et al., 2009).

The idea that the central processor is based on a brain-wide set of regions is closely related to the notion of a Global Workspace (Baars, 1988; Baars et al., 2013). The Global Workspace involves activity of process-dependent cortico-subcortical networks, each of which occupies a limited number of regions distributed across the brain. For instance, the well-known response selection stage has been argued to involve a neural network consisting of the prefrontal cortex (that probably represents the intended end state in short-term memory; Cisek & Kalaska, 2010), the temporal lobe (representing object identity), and the orbitofrontal cortex (representing the subjective value of the action). C-SMB suggests that the Global Workspace usually performs one processing stage at a time. This is controlled by the prefrontal cortex connecting some and disconnecting other areas. The output of each processing stage remains active as an activation pattern across the responsible network until it can be used as input by the next processing stage.

That each different cognitive process involves a unique functional integration of brain regions is in line with meta-analyses of large numbers of brain imaging studies. One such study examined 1840 fMRI studies (Laird et al., 2011). The results of that study suggested that, across many tasks, there are general processing networks for (1) visual perception, (2) higher cognitive processing, (3) motor and visuospatial integration, coordination, and execution, and (4) emotional and interoceptive processing. In the case of a particular task, however, it is likely that only very specific regions within each of these networks will be active. This depends on the features that make up the representation. An important observation was that, across all these studies, sequence recall and motor learning stood out in that this class of tasks mapped strongly onto two otherwise independent networks. One included primary sensorimotor cortices for upper extremities (including M1-S1), the other a premotor/SMA network. This observation is in line with our core notion that movement sequences can be carried out in at least two different processing strategies.

The important role of C-SMB’s central processor appears comparable to that of the prefrontal cortex. This neural structure, possibly aided by the basal ganglia (Stocco, Lebiere, & Anderson, 2010), organizes activity in regional networks by adjusting the functional connectivity across multiple cortical regions (Elsinger, Harrington, & Rao, 2006; Gelnar, Krauss, Sheehe, Szeverenyi, & Apkarian, 1999; Liuzzi et al., 2010; Roland, Larsen, Lassen, & Skinhoj, 1980). Furthermore, it also controls sensory input via the thalamus (Brunia, 1999). The capacity to control connections elsewhere in the brain probably enables the prefrontal cortex to temporarily bind feature-specific cortical areas in the service of short-term memory, and to control neural structures that are responsible for long-term memory. Those structures include the hippocampus (essential for episodic memory) and the basal ganglia (essential for procedural memory; Ashby & Crossley, 2012; Kumaran & McClelland, 2012; Postle, 2006). Only with extensive practice does the control of the prefrontal cortex over long-term memory diminish, as direct cortico-cortical connections develop that are responsible for semantic memory and motor skills. Moreover, the ability to control functional connectivity in the brain enables the prefrontal cortex to drive the transition from one activity pattern to another across the global workspace. At the cognitive level, this transition may involve the central processor translating an input into an output pattern during a processing stage.

Support for the C-SMB framework may in the future come from indications for functionally independent perceptual, central, and motor processing neural systems. Also, support could come from (further) indications that the different sequence execution modes that have been distinguished by behavioral research (Verwey & Abrahamse, 2012; Verwey & Wright, 2014) are associated with different regional activity patterns (like when participants are instructed to attend to what they are doing; Jueptner et al., 1997). With respect to the discrete sequence production task, for example, Abrahamse et al. (2013) speculated that the reaction mode would be associated with a network including the striatum and premotor cortex, the associative mode with the sensorimotor–premotor cortex loop, and the chunking mode with the sensorimotor–supplementary motor area loop.

Methodological ramifications

The flexibility assumed by C-SMB to account for the various ways in which a movement sequence can be controlled and executed comes with a caveat: sequencing tasks are likely to involve a mixture of processing (and execution) strategies that differs across participants and sometimes even within a single participant (e.g., following an error; Jentzsch & Dudschig, 2008; Notebaert et al., 2009). In behavioral studies, one might distinguish processing strategies by assessing the transfer to other versions of a task (e.g., Shea et al., 2011), and examining interference with secondary tasks (as, e.g., in Stoet & Hommel, 1999). In brain imaging studies, different strategies can be distinguished by observing different activity patterns across the brain (cf. Jueptner et al., 1997). So, one should always be aware that the performance and brain activity pattern observed in a particular sequencing study can result from a mixture of processing strategies. Future research should aim at assessing behavioral and imaging data with ‘pure’ reaction, association, and chunking modes.

Conclusions

Given the many behavioral indications that processing often occurs in successive stages, we proposed the Cognitive framework for Sequential Motor Behavior (C-SMB). C-SMB is important for two reasons. First, it demonstrates that serial processing can be explained by a relatively simple cognitive architecture consisting of processing systems (‘processors’) at the perceptual, central, and motor level. This cognitive architecture seems to fit well with how the brain processes information. Second, C-SMB underscores the human flexibility to produce movement sequences in various modes. The overview of these modes in Fig. 4 should help to interpret the results of behavioral and brain imaging studies, and may help to understand individually different and task-specific activity patterns in the brain. The flexibility of C-SMB to account for so many different processing strategies comes with the caveat that it is difficult to find exclusive support for this framework. But, then again, flexibility is a property of the human brain that we also need to account for.